PREFACE

In April 1976, General Research Corporation (GRC) began a study of
"Life-Cycle Costing of Major Defense System Software and Computer
Resources," Contract F19628-76-C-0180. The purpose was to assist Air
Force Program Offices and staff agencies in estimating, reporting and
controlling the life-cycle costs of software. The study was performed
under the direction of the Electronic Systems Division (AFSC), Computer
Systems Engineering Office (TOI). Captain William White was the Project
Officer and coordinator of the data-collection survey. The project team
wishes to thank him for his many contributions.

The project team was also assisted by many other individuals, to
whom we owe our thanks. Among them, the following individuals made
possible the detailed data collection: Captain J. M. Hall, Captain Baden,
and Lt. Anita Cohen of SAMSO (SCF); Ray Erickson, Joe Thompson, Chuck
Chiodini, and Everton Griffith of System Development Corporation; and
Eugene Kelly, Major Dale Wooldridge, and Thomas Kennedy of the US Air
Force Data Systems Design Center.
ABSTRACT


This technical report examines the costs of developing and maintaining
computer software for major defense systems. A process model is
described which depicts the relations among activities and phases of the
software life cycle, identifies the product and cost information that is
normally available, and specifies the milestones. This process model is
used as the basis for selecting the elements of a software cost reporting
system. The suggested reporting system also includes descriptions
of the final product, time phasing of product development, a standardized
list of Computer Program Components, and a standardized list of labor
categories.

During the study, data was collected from several sources including
the following Air Force organizations:

Electronic  Systems  Division 
Avionics  Systems  Division 
Space  and  Missile  Systems  Organization 
Data  Systems  Design  Center 


Cost estimating relationships for each phase of the software life
cycle are explored, using the process model and the data. The importance
of trade-offs in cost between phases is demonstrated. The report also
contains estimating relationships for evaluating the cost effects of
software size, computer capacity constraints, programming language, and
changes in requirements. It also addresses the separation of two
activities, error correction and product improvement, during the
maintenance phase of the life cycle. Results are integrated with other
software cost estimating techniques.
1 INTRODUCTION


1.1  GOALS  OF  THE  STUDY 

The  two  main  goals  of  the  study  were: 

To  define  the  elements  of  a system  for  reporting  the  costs 
of  developing  and  maintaining  computer  software 

To develop estimating relationships for the resources (man-hours,
computer-hours, elapsed time) consumed in the different
phases of the software life cycle.

These goals reflect a critical need for improving the Air Force's
ability to estimate the cost of software.

The requirement for good software cost-estimating techniques has
been increasing in the last decade as defense systems increased in
sophistication and in their dependence upon computers. More and more of
the defense dollar is going into the development and maintenance of
software; as Captain Devenny cites in his thesis,^ "DoD program managers will
buy an estimated three billion dollars worth of software in 1976."

As a result, what used to be a minor part of a system's acquisition
cost, requiring no more than gross rule-of-thumb estimating, is now a
major procurement item, requiring more accurate techniques. Thus, the
goals of this study are of great importance.

It is truly unfortunate that the first goal is still a requirement.
As early as 1966, reporting systems were being devised for collecting
software cost data (Ref. 2). Weinwurm described the data elements of such a
system in that year, and Nelson and Fleishman expanded Weinwurm's work
into a reporting system (Ref. 3), including forms, formats, etc. These early
efforts were companions to an extensive effort at System Development
Corporation (SDC) to develop cost estimating relationships for software.
More recently, Ronald A. Smith, in a study for Rome Air Development
Center, reported on a suggested "Management Data Collection and Reporting
System",^ prepared for RADC in October 1974. The report is part of the
structured programming series, and his system makes use of the library
concept included in that technique. The library concept is a powerful
data gathering and configuration management tool that makes it technically
possible to gather very detailed data about the status of a software
development.

So far, however, no uniform reporting system has been implemented
by DoD. As a result, "There is no widely accessible collection of cost
data which can be applied to cost estimation." (Ref. 8, page 11).
Considering that it takes 10 to 15 years for some large defense systems to
complete their life cycle, complete cost data would not be available for
a long time, even if a reporting system were implemented today. However,
if one had been implemented in 1966, useful data would be available today.

A more immediate need that a reporting system can fulfill is to
provide the Air Force's Program Offices with better information for cost
control. As Devenny^ adequately demonstrates in his thesis, software
developments rarely finish within their budgets. One major reason is
the poor quality of information available during the development, which
makes it nearly impossible to spot problems at early stages.

Why has no uniform system for data collection yet been implemented?
For one thing, the small part of the budget devoted to software in the
past probably discouraged any attempt to finance such a reporting system.
However, we speculate that a significant reason is the unwillingness of
the contractors to provide data in as much detail as the proposed systems
have required. What a contractor is technically able and willing to
report internally for management control, e.g., by using the Smith system^
mentioned above, could be far more than what he is willing to share
cheerfully with the Government. Thus, care should be taken to avoid
asking for excessive detail in any cost reporting system that one hopes to
implement.

The second major goal of this study was to improve the general
understanding of software cost estimating. This too has had a long history of
attempts by competent groups with little success. As Morin^ noted in her
review of software cost estimating techniques, "While a number of efforts
have been made to develop improved cost estimating techniques, no
generally accurate nor reliable method for estimating software development costs
has been found."

In  general,  previous  attempts  (by  SDC,^’^’^  Tecolote,^^  and  others) 
had  concentrated  on  total  cost.  Their  lack  of  success  was  not  due  to  a 
lack  of  competence:  many  imaginative  relationships  were  developed  and 
tested.  Unfortunately,  variances  remained  too  large  to  be  useful  for 
estimation,  as  witnessed  by  the  fact  that  these  groups  did  not  recommend 
the  use  of  their  findings  to  estimate  software  cost.^ 

In  an  effort  to  plow  new  ground,  we  were  directed  to  develop  resource 
estimating  relationships  (for  man-hours,  computer-hours,  and  elapsed  time) 
for  each  phase  of  the  life  cycle  separately,  rather  than  total  man-hours 
or  costs.  The  relationships  were  to  be  based  upon  currently  available 
data and developed using parametric cost estimating techniques.

The  following  principles  evolved  during  the  course  of  the  study  and 
guided  our  progress  towards  achieving  the  study's  goals. 

First, both the recommended cost reporting system and the estimating
relationships must be related to the Air Force's procurement process.
Attempts now underway to standardize this process are reported in
AFR 800-14,^^ which we have used as our guide in developing a process
model, tempered by visits to Program Offices and by our own experience.

Second, the cost reporting system must meet several objectives.
It must provide information which will help the Program Offices to
control software costs. At the same time it must be used to compile cost
data  on  many  systems,  using  standard  definitions,  so  that  better  cost 
estimating  relationships  can  eventually  be  developed.  Furthermore,  the 
system  must  not  be  so  detailed  that  it  is  impracticable,  and  it  must  be 
based  on  items  that  are  directly  measurable  by  contractors;  for  example, 
payroll  and  configuration-control  data. 

Third,  resource  estimating  relationships  for  man-hours,  computer- 
hours,  and  elapsed  time  should  be  developed  separately  for  each  phase 
of  the  software  life  cycle,  using  parametric  cost  estimating  techniques. 
The relationships should depend on inputs normally available to the Air
Force. A relationship which depends on input data not available to the
Program Office (for example, competence of the individual programmer)
might be statistically valid, but would be useless in practice. Since
the relationships must be validated by using currently available data,
they will have to be based upon less detailed information than that
proposed to be collected with the reporting system.


1.2  PROGRESS  OF  THE  STUDY 

Five  tasks  were  identified  in  the  Statement  of  Work.  Task  1 was 
to  conduct  a literature  survey  to  develop  lists  of  the  human  and  computer 
resources  required  to  develop  and  maintain  software. 


As will be demonstrated, addressing the life-cycle phases separately was
not sufficient; the trade-offs between phases are of utmost importance
in determining total (and phase) resource requirements. Future study of
these relations between phases should also lead to information that will
identify efficient allocations of resources among the phases. It is
striking that the literature is full of references to the "40-20-40"
rule: 40% analysis and design, 20% coding and checkout, and 40%
integration and test. Having 40% of the work still ahead after all of the code
has been written makes one wonder whether enough time is spent in analysis
and design. Discovering the optimal level of analysis and design should
go a long way to reduce the risk associated with software development.
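
The 40-20-40 rule amounts to a proportional split of a total man-hour budget. The sketch below is illustrative only: the function name, the phase labels used as keys, and the 10,000 man-hour total are assumptions chosen for the example, not figures from the study.

```python
# Illustrative only: split a total man-hour budget by the "40-20-40" rule.
RULE_40_20_40 = {
    "analysis and design": 0.40,
    "coding and checkout": 0.20,
    "integration and test": 0.40,
}


def allocate_man_hours(total_man_hours, shares=RULE_40_20_40):
    """Return man-hours per phase for the given fractional shares."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("phase shares must sum to 1")
    return {phase: total_man_hours * share for phase, share in shares.items()}


allocation = allocate_man_hours(10_000)  # hypothetical project total
for phase, hours in allocation.items():
    print(f"{phase}: {hours:.0f} man-hours")
```

Passing a different `shares` dictionary would show how a heavier investment in analysis and design shifts hours away from the later phases, which is the trade-off the text raises.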
Building upon this review and our own experience in software
development, we were to define the cost elements of a recommended reporting
system (Task 2) and relate them to a typical software work breakdown
structure (WBS) (Task 3).

During the study, data would be collected for verifying the usability
of the WBS for gathering and reporting cost performance data (Task 4),
and the cost elements and WBS would be modified as necessary. The data
were also to be used in developing estimating relationships.

Finally, we were to develop estimating relationships for man-hours,
computer-hours, and elapsed time for each of the life-cycle phases (Task 5).
Of particular concern were relationships of life-cycle costs to system
characteristics and design parameters; differentiating between alternative
designs; performing trade-offs between development and maintenance costs;
evaluating the effect of Engineering Change Proposals (ECPs) on costs;
and estimating the cost consequences of interface-equipment constraints.

By the first Technical Direction meeting (June 22) GRC had made the
following progress. An initial human-resource list and computer-resource
list had been completed (Task 1) as well as an initial cost category list
(Task 2). Seven Air Force Program Offices had been visited (three at ESD,
two at ASD, and two at SAMSO) for the purpose of determining data
availability (Task 4), management cost-control problems (Task 2), and typical
WBSs (Task 3). Finally, a Process Model had been developed which
identified the phases of Air Force software development and maintenance
and the information available at each phase (Task 5).

The following Program Offices were visited:

ESD
    Cheyenne Mountain (427-M)
    Combat Grande
    Over the Horizon Backscatter Radar
ASD
    EF-111A
    F-16
SAMSO
    Satellite Central Facility (SCF)
    Defense Support Program (DSP)

The results of the visits to Program Offices were extremely important
in shaping the study's direction. They were especially significant
in defining the elements of the recommended reporting system. The
Program Offices have a difficult problem in controlling software cost. A
major reason is that software costs are reported at far too aggregated
a level, in some cases as only a single cost. The problem is complicated
by the inability to relate software cost to progress towards completion
of software development. Progress is often reported by the percentage
of estimated man-hours expended to date, which in most cases is a less
than reliable estimator of the amount of software actually completed. Also,
since there is no standard list of software products (end items) for
which resource requirements have been collected in previous projects,
the Program Offices have few precedents upon which to base estimates of
resource requirements.
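
A small sketch of why the percentage of man-hours expended can mislead as a progress measure. All names and figures here are hypothetical; the end-item count simply stands in for whatever product-completion measure a reporting system might record.

```python
def percent_hours_expended(expended, estimated_total):
    """Progress as commonly reported: share of estimated man-hours spent."""
    return 100.0 * expended / estimated_total


def percent_product_complete(items_done, items_planned):
    """Progress measured on the product: share of end items completed."""
    return 100.0 * items_done / items_planned


# Hypothetical project status (invented numbers)
hours_pct = percent_hours_expended(expended=8_000, estimated_total=10_000)
product_pct = percent_product_complete(items_done=12, items_planned=30)
gap = hours_pct - product_pct

print(f"man-hours expended: {hours_pct:.0f}%")
print(f"end items complete: {product_pct:.0f}%")
print(f"gap: {gap:.0f} percentage points")
```

A large positive gap is the kind of early warning that a single aggregated cost figure cannot give.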

The  variability  in  the  recorded  costs  of  previous  developments  is 
due  not  only  to  their  size  and  complexity,  but  also  to  the  following: 

1. Fixed-price contracts have gone to ceiling. Hence, the costs
reported understate the costs incurred.

2. Cost-reimbursement contracts have been augmented by supplementary
agreements* that incorporate changes in requirements.
This has resulted in redoing work already completed, so that
costs are abnormally high per line of code delivered.

3. Level-of-effort maintenance contracts include costs for
product improvement as well as error correction. Costs are
difficult to assign to these two different functions.

4. Software costs are often hidden because of the difficulty,
in some instances, of allocating costs between hardware and
software.

* Generally initiated by Engineering Change Proposals (ECPs).
We concluded that the Program Offices were not a good source of
data on the resource requirements for individual phases of the life cycle,
at least within the resources available in the current contract. The
quality of the data was poor because of the lack of uniform reporting,
as described above. Also, there was not sufficient detail relating
product completion to cost. Finally, what data existed was difficult to
collect and did not include all phases of the life cycle, especially
maintenance. At some future time the data from the Program Offices
should be compiled into a usable data base, but the effort required will
be well beyond the resources of this contract.

Hence, at the first TD meeting we proposed to take the following
study approach. Data would be collected from readily available sources
in which data on software product development could be measured. Of
special concern was visibility into both development and maintenance
portions of the life cycle. SAMSO contractors appeared to be a good
source, although PARMIS (see Sec. 5) turned out to be the best source.

These data would be used to test hypotheses about resource
consumption developed from the process model. These relationships would
then be applied to the aggregated cost and man-hour data available in
the literature (SDC^'^’^ and ADPREP^^) to derive interim estimating
relationships.

On a parallel path, the cost reporting system would be defined.
In the long term, it will be used to establish an improved cost data
base and improved estimating relationships.

Since that meeting, we have defined the cost elements and WBS
(Tasks 2 and 3); hypothesized resource estimating relationships (Task 5);
and collected data from PARMIS, SDC, and IBM (Task 4). Changes to the
Process Model have been made (Task 5) and data have been analyzed with
respect to the hypothesized estimating relationships (Task 5). Work
has concentrated on man-hour estimation.


Results were at first disappointing. Applying the data to the
hypothesized man-hour relationships for the individual phases of the
life cycle resulted in less precision than estimates made on the total.
That is, data plots of different life-cycle phases showed such "buckshot"
patterns that an estimate made by adding up estimates of the
individual phases showed much greater variability than aggregated
estimates based upon relationships for total man-hours.

It was at this point that the importance of the relations between
phases became clear. Much of the variability in the total, which has
been seen previously by various researchers, was due to differences in
man-hour allocation among phases in the development process. There are
trade-offs between the phases. For example, allocating too few
man-hours to analysis and design can cause higher error rates, resulting
in increased man-hours for integration and test. It could also result
in large maintenance costs, which are not even seen in the
development-cost data. Conversely, efficient levels of analysis and design would
yield lower total man-hours or costs. Trying to estimate man-hours for
each individual phase only accentuates the variance, and the trade-offs
implicit in the totals are not considered at all. Statistically
speaking, the total variance was bound to be larger since the covariance
terms (some of which are negative) were not considered.

If this theory could in the future be developed and tested, not
only could we better explain software costs, but we could identify rules
for the efficient distribution of resources among phases. Using these
rules we would have hope of reducing the risk of software development,
a problem stressed by Clapp.^


1.3 ORGANIZATION OF THIS REPORT


This final report gives a detailed account of our
work towards the development of phase-specific estimating relationships
and the cost reporting system.

Sec. 2 defines a process model of the software
life cycle. This model serves as the basis for all the later sections.

In Sec. 3, the man-hour estimating relationships hypothesized are
given. Although they were not substantiated with any reasonable degree
of success, they serve to identify the key parameters and equation forms.

In Sec. 4, we develop hypotheses for the trade-offs between phases.
Work towards testing these relationships, using the PARMIS data base, is
reported in Sec. 5. Although results are not statistically conclusive,
the approach offers great potential for resource allocation, risk
reduction, and eventually software cost estimation.

Sec. 6 contains a number of estimating relationship results using
other, more aggregated, data bases. The appropriateness of using data
on business-software developments, such as PARMIS, to estimate resource
requirements for defense-system software developments is addressed by
examining differences in resource consumption between the two types.
We believe that the form of the estimating relationships should be
applicable to both types, although the coefficients will change. Also
included are results that demonstrate (1) the importance of several
other explanatory variables to software cost, (2) relationships between
alternative measures of the software product, and (3) characteristics
of the maintenance activity of software.

In Sec. 7 some "rule of thumb" estimating relationships and methods
of estimation are given. It was felt that a presentation of some of the
estimating techniques in the literature, updated by the experience of the
present study, would be a useful supplement to the report. The reader
should be cautioned that this was not a major effort of the study, and
does not represent an exhaustive attempt to pick the best relationships
out of the literature.

In Sec. 8, the elements of the future cost reporting system are
defined. Also included are data requirements to measure product
completion, a standardized list of software end items, man-hour and
computer-hour reporting elements, and the relationships of these reporting
elements to an existing automated report system.

Conclusions of the study and recommendations for future work are
contained in Sec. 9.
2 PROCESS MODEL OF THE SOFTWARE LIFE CYCLE

2.1  INTRODUCTION 

This software cost study differs from earlier investigations in
two significant ways. First, it has examined the entire life cycle of
software, rather than concentrating on the development phase. There is
evidence (Ref. 13) that the post-development activities of the software life
cycle account for a significant fraction of all software costs and that
the fraction is growing. Thus, a life-cycle cost model that considers
more than development costs is necessary. Second, the current
investigation was directed to organize its resource estimation techniques in
terms of separate estimators for the component activities of the
software life cycle. This should be contrasted with aggregated estimation
techniques such as those reported in Refs. 14, 15, 16, and 17.

A disaggregated estimating technique for software resource
consumption has important advantages. First, it provides insight into the
development process, possibly providing direction for modifying the
process. More importantly, it can serve as both a predictor and a
control or monitoring mechanism. Aggregated techniques cannot be used to
detect cost problems during a life cycle, unless a means exists for
allocating the aggregate costs. Finally, disaggregated techniques may
serve to reduce error in the estimates as information about the project
improves. For example, explanatory factors for later activities may
be based upon estimates of the outputs of the early activities. The
estimate of remaining costs can then be improved at the completion of
the early activities.

The investigation started with an oversimplified view of the estimators
for the component man-hours (costs) of the software life cycle.
An aggregated approach to making an estimate $\hat{Y}$ of man-hours $Y$ can be
represented by:

$$\hat{Y}_{\text{aggregate}} = f(\text{explanatory factors})$$

When estimating each activity separately, the representation is modified
as follows:

$$\hat{Y}_{\text{disaggregate}} = \sum_j \hat{y}_j$$

where

$$\hat{y}_j = f_j(\text{explanatory factors for activity } j)$$

If such estimators are based on regression analyses, then the accuracy
of an estimator can be quantified by the variance of the estimate.
For either approach, a variance can be associated with each estimator:
i.e., $\hat{Y}_{\text{aggregate}}$ will have a variance $\hat{V}_{\text{agg}}$, and $\hat{y}_j$ will have variance $\hat{v}_j$.
If the total man-hours are estimated by adding up the component man-hours,
the variance of the sum is the sum of the variances,

$$\hat{V}_{\text{disagg}} = \sum_j \hat{v}_j$$

assuming that there are no interactions or tradeoffs among the activities
(i.e., they are independent).

Late in this study it became apparent that the variance of the sum
of component estimators was generally larger than that of known aggregate
estimators. This would result in a composite estimate of total man-hours
that was poorer than an estimate made by aggregate techniques.

It thus became apparent that a model based upon independence of
the activities is far too simple. In statistical terms, the assumption
of independence between activities must be dropped and covariance terms
must be added to the estimated variance. That is,

$$\hat{V}_{\text{disagg}} = \sum_j \hat{v}_j + \sum_{i \ne j} \mathrm{covar}_{ij}$$
Covariance, in this case, captures the interactions and trade-offs
between the man-hours required for individual activities that are not
apparent in a composite estimation technique.
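
A numerical sketch of the effect of the covariance term, using two activities. The man-hour figures are invented to show a trade-off (projects that spend more on design spend less on test); they are not data from the study, and the sketch works with sample variances of the hours themselves rather than with regression estimator variances. With two activities the covariance is counted once for each ordered pair, hence the factor of 2.

```python
from statistics import mean


def variance(xs):
    """Unbiased sample variance."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def covariance(xs, ys):
    """Unbiased sample covariance of two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Hypothetical man-hours (hundreds) on five projects of similar size.
design = [30, 40, 50, 60, 70]
test = [75, 62, 53, 41, 29]
total = [d + t for d, t in zip(design, test)]

v_design, v_test = variance(design), variance(test)
cov = covariance(design, test)

independent_sum = v_design + v_test            # assumes no interaction
with_covariance = v_design + v_test + 2 * cov  # includes the trade-off

print(f"sum of variances (independence assumed): {independent_sum:.1f}")
print(f"sum of variances plus covariance terms:  {with_covariance:.1f}")
print(f"variance of the observed totals:         {variance(total):.1f}")
```

Because the covariance here is negative, ignoring it greatly overstates the variability of the total, which mirrors the behavior the study observed.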

Thus,  a second  requirement  has  been  levied  on  the  approach  this 
investigation  was  taking.  That  is,  any  estimation  technique  based  upon 
individual  activities  will  have  to  consider  the  interactions  of  the 
activities  if  the  variance  of  the  life-cycle  estimates  is  to  be  small 
enough  to  make  the  technique  viable.  Preliminary  attempts  have  been 
made  to  change  the  model  to  incorporate  these  interactions,  and  some 
estimates  of  these  interactions  have  been  achieved.  Sample  size  is  too 
small,  however,  for  results  to  be  conclusive. 

We have used the parametric approach to estimation throughout the
study. The approach is to identify a number of physical, performance,
or design characteristics which are directly related to costs (or
resource consumption), for the system or equipment under study. Then,
historical data from related, or similar, systems are used to calibrate
and verify a postulated mathematical relationship between cost and these
characteristics (or parameters): the resulting equation is a cost
estimating relationship (CER).

The parametric approach uses principles of statistical inference;
we particularly emphasize the principle that a specific selection of
characteristics (independent variables) and a specific mathematical
form should be hypothesized on the basis of a technical understanding
of the process under study. Then, statistical procedures may be
employed to verify the postulated CER.
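
A minimal sketch of calibrating a postulated CER against historical data. The power-law form (man-hours = a * size^b), the choice of program size as the single independent variable, and every data point are assumptions made for the example; the text says only that the form should be hypothesized first and then checked statistically.

```python
import math


def fit_power_law(sizes, man_hours):
    """Least-squares fit of log(man_hours) = log(a) + b * log(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(h) for h in man_hours]
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_bar - b * x_bar)
    return a, b


# Hypothetical history: size in thousands of instructions, effort in man-hours.
sizes = [5, 10, 20, 40, 80]
hours = [2100, 4500, 9800, 21000, 46000]

a, b = fit_power_law(sizes, hours)
estimate = a * 30 ** b  # postulated CER applied to a new 30K-instruction program

print(f"calibrated CER: man_hours = {a:.1f} * size^{b:.3f}")
print(f"estimate for size 30: {estimate:.0f} man-hours")
```

In keeping with the principle above, the fit only calibrates a form chosen beforehand; residuals and the variance of the estimate would still have to be examined before the relationship could be called verified.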

In practice, however, statistical procedures are often used to
identify the independent variables and to specify the functional form
of the estimating equation. We believe this procedure is incorrect,
although it is difficult to be rigorous on this point. Parametric cost
analysis is largely exploratory and there is invariably some
trial-and-error in finding a defensible CER. Thus, a search (using statistical
selection criteria) among a limited number of hypotheses is not
necessarily to be faulted. An understanding of the underlying relationships
between dependent and independent variables may be achieved only after
statistical indicators have pointed the way. But there must be an
eventual logic and rationale for the engineering and economic
relationships which underlie the CER.

The materials presented in this and the next two sections develop
the hypotheses for a software life-cycle cost estimating technique*
subject to the considerations previously mentioned. This section
introduces a process model of the software life cycle, upon which all
subsequent arguments are based. Sec. 3 develops hypotheses about the factors
that might be used to explain the man-hours required for each component
of the software life cycle, assuming independence between the components.
Sec. 4 proposes hypotheses regarding the interactions of the life-cycle
components and how such interactions might be combined with the other
explanatory factors to form a man-hour estimation technique. Validation
of the hypotheses is the subject of Secs. 5 and 6.

2.2  OVERVIEW  OF  THE  PROCESS  MODEL 

The initial version of the process model of software development
was derived from interviews at Air Force Program Offices and from current
Air Force regulations; in particular, AFR 800-14.^^ During the contract,
the model was revised in several respects. This section is an overview
of the original model, the revisions, and the reasons for the revisions.
The revised model is then defined in detail in Secs. 2.3, 2.4, and 2.5.


* For the most part man-hours, the most significant resource, are evaluated
instead of cost.
The development of software for use in a defense system can occur
anywhere in the "major defense system acquisition life cycle" shown in
Fig. 2.1. Interviews at Program Offices that are developing software
for communications, command, control, and intelligence (C³I), and for
avionic systems, indicated that software development usually occurs
during the validation, full-scale development, or deployment phases of
the major defense system acquisition life cycle. However, software may
be developed in any phase.


Our initial formulation of the Process Model for software development
(Fig. 2.2) divides the software life cycle into three major phases:*
(1) Development, (2) Installation (or Production), and (3) Operation and
Support, each phase consisting of several activities. This flow-diagram
representation of the process depicts the relationships between the
activities and the information flows that define the software product
and the changes to it.

The Development Phase begins with the definition of the software
to be developed, generally contained in the System Specification.
Trade-offs between hardware and software are assumed to have been
completed before this point, although such trade-offs are often included
as part of the Software Analysis activity.

During this activity the software is further defined, and divided
into several parts called Computer Program Configuration Items (CPCIs).
This definition is formalized at a System Design Review (SDR) along with
hardware Configuration Items (CIs). The requirements for the software
are defined separately for each CPCI in the Part I (Development)
Specifications. These are "baselined" before the Preliminary Design
Review (PDR). A separate review is required for each CPCI, although they
are sometimes combined. Once the Part I Specifications are baselined, any
action that would change them must be initiated by a formal Engineering
Change Proposal (ECP).

Once CPCIs have been defined, resource consumption can begin to
be separately recorded for each CPCI. Unfortunately, there are no
standard definitions for CPCIs, so that comparisons among software
developments are not easy. Furthermore, the CPCIs as defined often
bear little relation to the work packages actually used in the
development. A preliminary definition of CPCIs suggested in the original
request for proposals is often adopted without change in the proposal,
and carried on into the development, even though the Software Analysis
results in a different subdivision for the actual design and coding of
the software. This discrepancy can cause considerable difficulties in
properly accounting for the man-hours expended in design and coding.

* For software, the Production phase is primarily a process of
installation.

Figure 2.2. Initial Process Model

Detailed design is accomplished in the Software Design activity,
and documented in the Part II (Product) Specifications which are prepared
for a Critical Design Review (CDR). A separate Part II Specification is
prepared for each CPCI. At this point each CPCI has been further broken
down into Computer Program Components (CPCs) and further into modules,
the actual units that are to be separately coded. Coding and Checkout
is the next activity, with assembled, compiled, and debugged modules of
software as its product. No information is currently made available to
DoD on progress during this activity. Test and Integration follows, with
internal testing (Computer Program Test and Evaluation) preceding
qualification testing. Prototype hardware may be introduced for these
tests. The Software Development phase ends with an accepted product for
each CPCI at Functional and Physical Configuration Audits (FCA and PCA;
Mil Std 1521A).

We have described the process in sequence. As Fig. 2.2 shows,
however, there are numerous possible backtrackings or "feedback loops."
Some of these take the form of new requirements expressed in Engineering
Change Proposals (ECPs), either externally or internally generated.
When these cause changes in the Part I Specifications, they require a
repetition of analysis, design, coding, etc. Similarly, redesign can
occur that changes the Part II Specifications, requiring recoding and
retesting. However, there is no documentary record of this loop, since
Part II Specifications are rarely baselined until after qualification
testing. Thus, even though there is considerable looping through the
activities, only the loops that involve changes in Part I Specifications
are recorded and visible outside the development group. Furthermore,
there is no way to track the progress of development between CDR and
the beginning of formal qualification testing.

Software is then installed (a phase that corresponds to Production
in a hardware procurement), and enters the Operation and Support phase.
During this phase of the life cycle, it is often impossible to separate
software from hardware activities. For example, the Fault Detection
activity shown in Fig. 2.2 may result from either a hardware malfunction
or a bug in the software.

Fault Isolation may result in the detection of a software error.
In that case, software maintenance responds to the error, possibly
requiring new analysis, design, coding, etc. The development cycle is
essentially repeated (although a change in specifications could hardly
be termed "maintenance").

The process model has several features that distinguish it from
earlier models of software development:

1. It addresses the entire life cycle of software, not only
development.

2. It identifies parameters that could be used to measure the
changes of state in the software during development.

3. It is oriented to USAF standards and regulations.

The initial formulation of the process model based on AFR 800-14 also
has several shortcomings, which have led to the revised process model
shown in Fig. 2.3. (We should emphasize that acceptance of these
revisions will require modification of AFR 800-14.)

Figure 2.3. Revised Process Model


First, the fact that Part II Specifications are rarely baselined
before qualification testing means that man-hours spent in design changes
and resulting recoding before formal qualification testing cannot be
separately measured. This is shown in Fig. 2.3 by the dashed lines and
boxes connected to the Internal Test and Integration activity.

In effect, these redesign and recoding activities are included in
Internal Test and Integration. We believe that these activities are a
primary cause of variability in software cost. It is easy to see how a
poorly conceived Software Design activity could lead to a large number
of errors in the Coding and Checkout activity and thus to a great deal
of redesign. Since this consequence is not separately monitored and
occurs late in the development phase, it can be the most important cause
of cost and schedule problems. Reducing the amount of redesign and
recoding is a primary way to reduce the risk in software development;
but it can only be accomplished by understanding the relationships
between design, coding errors, and redesign. That is, the trade-offs
of resources between activities are of prime importance.

Desirable as it would be to account separately for redesign and
recoding, we do not recommend an attempt to separate them in Internal
Test and Integration. The activities are so intertwined that separation
would be nearly impossible. Hence, the activities indicated by dashed
lines in Fig. 2.3 are here treated as part of the Internal Test and
Integration activity.

Note that there are two testing activities. Internal Test and
Integration* includes all the testing and integration that occur before
formal qualification testing. Little or no visibility is available
during this activity. During Qualification Test, on the other hand,
the contractor is carrying out formal tests, which are viewed by Air
Force personnel. Tests include Preliminary Qualification Tests (PQTs)
and Final Qualification Tests (FQTs). Results are formally documented,
and corrections of any errors that are discovered can be recorded.

* Also known as Computer Program Test and Evaluation.

A second shortcoming of the initial process model is that activities
do not have clearly defined beginnings and ends. They may overlap
in time, as Figs. 2.2 and 2.3 suggest, rather than being clearly
separated by "milestones." Milestone definitions are very convenient,
however. For example, Software Design can be conveniently defined as
starting with PDR and ending with CDR. A further complication is that
many software developments have reported man-hours by such milestone
divisions, but others have reported them using "activity definitions"
of terms such as analysis, design, and coding. This discrepancy has
caused confusion as well as variability in the data. AFR 800-14
perpetuates this practice by using attributes of both definitions when
it refers to "phases".

In  this  report  we  will  try  to  avoid  confusion  in  terms  by  adopting 
the  following  convention.  When  the  terms  analysis,  design,  and  so  on 
are  used  without  qualifiers,  an  activity  definition  is  intended.  When 
the  terms  are  followed  by  phase  or  milestone,  a milestone  definition  is 
intended. 

In  general,  we  use  "milestone"  definitions  of  the  activities 
because  that  appears  to  be  the  way  the  data  has  most  often  been  reported. 
When  analysis,  design,  etc.  are  defined  in  this  manner,  they  form  clearly 
separable  phases,  with  no  feedback  other  than  ECPs  (which  we  argue  should 
have  their  own  development  cycle).

However, in several sections of this report we make use of activity
definitions. In Secs. 4 and 5, the PARMIS data allows such a definition,
which is important to measuring overlap among activities, a key to the
study of trade-offs between activities (as well as phases). In Sec. 8
(the reporting elements) both definitions are used. A transition matrix
which relates one to the other is explicitly defined. This matrix
(Table 8.14) will be useful in separating the two uses of these terms.
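
The idea of a transition matrix between the two definitions can be sketched in a few lines of Python. This is illustrative only: the phase and activity labels are simplified, the shares are invented placeholders (they are not the entries of Table 8.14), and `to_activity_hours` is a hypothetical helper name. Each row states what fraction of the man-hours reported under a milestone phase is assumed to belong to each activity.

```python
# Hypothetical transition matrix: rows are milestone phases, columns are
# activities. Every row sums to 1.0. The shares are placeholders only.
ACTIVITIES = ["analysis", "design", "coding", "test"]

TRANSITION = {
    "analysis_phase": [0.80, 0.20, 0.00, 0.00],
    "design_phase":   [0.10, 0.70, 0.20, 0.00],
    "coding_phase":   [0.00, 0.10, 0.70, 0.20],
    "test_phase":     [0.00, 0.10, 0.20, 0.70],
}


def to_activity_hours(phase_hours):
    """Redistribute man-hours reported by milestone phase onto activities."""
    totals = dict.fromkeys(ACTIVITIES, 0.0)
    for phase, hours in phase_hours.items():
        for activity, share in zip(ACTIVITIES, TRANSITION[phase]):
            totals[activity] += hours * share
    return totals
```

Because every row sums to one, the total man-hours are preserved; only their labeling changes.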

The milestone definitions have the following boundaries. Software
Analysis (milestone) includes all activities up to PDR; the date of PDR
can be different for each CPCI. Software Design (milestone) ends at
CDR, for each CPCI. Coding and Checkout (milestone) ends with the
baselining of the source deck, for each module. Internal Test and
Integration (milestone) ends with the beginning of the first
qualification test. Qualification testing (milestone) ends with the
acceptance of each CPCI at PCA. Installation (milestone) is then
initiated, and ends with Initial Operational Test and Evaluation (IOT&E)
at the operating sites. Operation and Support then commences.
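
These boundaries amount to an ordered lookup from closing milestone to phase. The following Python sketch shows one way to express that; the function name `phase_on`, the dictionary keys, and the choice to count the milestone date itself as part of the phase it closes are all assumptions made for illustration. It also simplifies by treating the end of Coding and Checkout as a single date, as it would be for one module.

```python
from datetime import date

# Ordered (phase, closing milestone) pairs, following the boundaries above.
PHASE_END = [
    ("Software Analysis", "PDR"),
    ("Software Design", "CDR"),
    ("Coding and Checkout", "source baselined"),
    ("Internal Test and Integration", "first qualification test"),
    ("Qualification Test", "PCA"),
    ("Installation", "IOT&E"),
]


def phase_on(day, milestone_dates):
    """Return the milestone phase in effect on `day`.

    `milestone_dates` maps each closing milestone name to its date.
    """
    for phase, milestone in PHASE_END:
        if day <= milestone_dates[milestone]:
            return phase
    # After the last milestone, the software is in Operation and Support.
    return "Operation and Support"
```

Man-hours charged on a given day could then be assigned to a milestone phase without any judgment about which activity was actually being performed.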

Between Coding and Checkout (milestone) and Internal Test and
Integration (milestone), we have defined a "moving milestone,"
indicated by the jagged separation in Fig. 2.3. That is, the end of
Coding and Checkout is reported separately for each module (or CPC), as
a means of tracking the progress of software development. No milestone
or means of tracking now exists between these activities, so the addition
will greatly assist Program Offices in the control of software
development. This topic will be explained further in Sec. 8.
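
As a rough illustration of how a moving milestone could be used for tracking, the Python sketch below computes the fraction of modules whose Coding and Checkout has ended as of a given date. The function name `coding_progress` and the convention of recording `None` for a module not yet baselined are assumptions, not part of the reporting system described in Sec. 8.

```python
from datetime import date


def coding_progress(baseline_dates, as_of):
    """Return (fraction complete, completed module names) as of `as_of`.

    `baseline_dates` maps module name to the date its source was
    baselined, or None if Coding and Checkout is still under way.
    """
    if not baseline_dates:
        raise ValueError("at least one module is required")
    done = sorted(
        name
        for name, baselined in baseline_dates.items()
        if baselined is not None and baselined <= as_of
    )
    return len(done) / len(baseline_dates), done
```

A count of modules is the simplest possible measure; weighting each module by its estimated size would be a natural refinement.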

The flow of maintenance activities during the Operation and
Support phase has also been refined significantly in Fig. 2.3. The
effects of new requirements have been separated from those of fault
detection, so that true "maintenance" activities can be identified.

There remain, however, two sources of confusion between maintenance
activities and new requirements. For one thing, an ECP that specifies
new requirements may also incorporate a number of corrections of
noncritical errors. For another, the correction of errors detected during
Qualification Testing of an ECP may be treated as maintenance activity
and thus not be associated with the ECP that caused the errors to be
introduced.

The foregoing has been an overview of the reasons for revising the
initial process model into the form shown in Fig. 2.3. This revised
model is the framework for the hypotheses concerning the factors that
explain software costs (Sec. 3) and for the Work Breakdown Structure
presented in Sec. 8. The following three subsections refer to Fig. 2.3
and give definitions for the terms used throughout the rest of this
report.

2.3  DEVELOPMENT  PHASE 

The development phase of software begins with a general system
specification, or system segment specification.^18 This system may be
totally new or a modification of an existing system. In any case, the
system specification is available before the start of software
development.

The functional and performance requirements of the specification
are then partitioned among one or more Computer Program Configuration
Items (CPCIs). Each CPCI undergoes Software Analysis (milestone),
which formulates a Part I Specification.^18 The Software Analysis
(milestone) terminates, for each CPCI, at PDR after the Part I
Specification for that CPCI is baselined. This baselining establishes the
Allocated Configuration Identification and subjects the CPCI to formal
configuration controls (MIL Std 480 and 483). This baselining should
occur before PDR (Preliminary Design Review).

At PDR the Software Design (milestone) begins. During this period
a complete draft of Part II Specifications^18 and a more detailed design
of the CPCI are generated. Software Design (milestone) for each CPCI
terminates with a Critical Design Review (CDR) of its Part II
Specification. The design reviews of all related CPCIs are generally held
simultaneously. The Part II Specifications are usually subjected only
to informal configuration controls between CDR and Physical
Configuration Audit (PCA), despite the fact that all CPCI coding,
debugging, testing, and integration occur between these two milestones.

At CDR, Coding and Checkout (milestone) begins. It terminates
when the code for each module (or CPC) has been written, assembled or
compiled, and separately tested (unit-tested) to the programmer's
satisfaction. There is no single milestone for the completion of this
activity; the "moving milestone" marks its completion for each module
of software. The product of this phase is the version of the code to
be used in testing and integration of the CPCI.

Functions during Internal Test and Integration (milestone) are
not exclusively associated with individual CPCIs. The purpose is to
informally test each CPCI, collections of CPCIs, and the software as
a whole. During this period, special hardware may be introduced for
some tests. The total cost of Internal Test and Integration must thus
be viewed from the system level and is not always attributable to
individual CPCIs or even exclusively to software.

Each CPCI has a set of test plans and procedures, usually relating
back to the Part I Specification. Preliminary and Formal Qualification
Tests assure that each CPCI can pass this specific set of test
procedures. These are performed during Qualification Test (milestone).
Like Internal Test and Integration, the costs during this period cannot
be easily allocated to individual CPCIs.


After the Part I Specification and/or the Part II Specification
are baselined, any subsequent change to the specifications is tracked
by Engineering Change Proposals (ECPs) and Specification Change Notices
(SCNs).^18 Fig. 2.3 indicates flows from each milestone period back to
the prior period. These feedbacks represent supplemental agreements
to modify baselined Part I or Part II Specifications and software before
delivery of the first version of the system. Each modification of a
specification requires repeating a number of the previous activities.
Such changes can be internally generated or can arise for external
reasons, in the form of new requirements.
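
The feedback paths can be pictured as re-entering the development sequence at an earlier point. The Python sketch below is one simplified reading: a change to a Part I Specification is assumed to restart the sequence at analysis, a change to a Part II Specification at design, and every downstream activity is assumed to be repeated to some degree. The names `RESTART_AT` and `activities_to_repeat` are hypothetical.

```python
# Development activities in their nominal order.
DEVELOPMENT_SEQUENCE = [
    "analysis",
    "design",
    "coding and checkout",
    "internal test and integration",
    "qualification test",
]

# Assumed re-entry point for a change to each specification.
RESTART_AT = {"Part I": "analysis", "Part II": "design"}


def activities_to_repeat(changed_spec):
    """List the activities revisited after a change to the given spec."""
    start = DEVELOPMENT_SEQUENCE.index(RESTART_AT[changed_spec])
    return DEVELOPMENT_SEQUENCE[start:]
```

In practice the extent of rework depends on the scope of the change; the sketch identifies only which activities are touched, not how many man-hours each consumes.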

Figure 2.3 also shows paths from Internal Test and Integration
(milestone) to redesign and recoding. Conceptually, it is possible to
exercise formal configuration control over the checked-out software during
test and integration by using a development library. If a problem
detected during testing required redesign, recoding, and retesting, these
activities could then be attributed to design, coding, and testing
activities respectively. Practically, however, it is not possible to
require a contractor to directly measure (separate) and report these
resources against these activities.

2.4  INSTALLATION  PHASE  (PRODUCTION) 

The Installation (Production) phase of software includes the
generation of multiple copies of the software and their installation in
computer systems. For C3I systems, installation may require
site-dependent modifications of standard software packages. For avionics
systems, installation usually means only distributing identical copies
of the software. If multiple copies of the software are not required,
then there is no installation (production) phase in the software life
cycle, since the test site will presumably be the only operational site.

2.5  OPERATION  AND  SUPPORT  PHASE 

Operation  and  Support  (maintenance)  of  software  must  be  examined 
from  two  different  points  of  view:  error  correction,  and  modification. 

Error correction consists of repairing reported difficulties with
the operation of the software. The errors that are corrected are
generally those which prevent the software from accomplishing its
allocated functions. By definition, error correction does not require
changes to the Part I or Part II Specifications. Maintenance may be
concurrent with operations. Errors may also be corrected simultaneously
on multiple versions of software.

Modification  of  an  existing  model  or  version  of  software  produces 
a new  version  of  the  software.  It  begins  with  revising  the  Part  I or 
Part  II  Specifications,  either  to  correct  deficiencies  in  the  software's 
performance  or  to  expand  its  capabilities.  Modifications  generally  also 
include  correction  of  errors  that  were  temporarily  repaired,  or  deferred, 
as  part  of  the  error  correction  activities. 

2.5.1  Error  Correction 

Error correction is embodied in three concurrent activities
(Fig. 2.3): operations, fault isolation, and maintenance. Operations is
that activity which exercises the software to accomplish the intended
mission. In some applications, this activity may require computer
operators, etc., the costs of which can be viewed as operation costs of
the software. In other applications (such as avionics) the costs of
operation have little to do with the software.

When the Operations activity encounters problems with the system,
it reports them as faults. The Fault Isolation activity then screens
the reported faults and traces their cause to the hardware or software.
The costs of this activity, like those of Operations, cannot be entirely
attributed to software since the faults may lie with hardware. Those
faults that are traced to software are the responsibility of the
Maintenance activity.


The Maintenance activity may alter either the object code or the
source code or both. The definition of the CPCI requirements and
functions, as expressed by the Part I and Part II Specifications, is
not altered. This activity (unlike Operations and Fault Isolation)
can be charged completely to software.
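
The flow from Operations through Fault Isolation to Maintenance can be sketched as a simple routing step. In the Python sketch below, the record layout (a fault identifier paired with a traced cause of "hardware" or "software") and the name `route_faults` are assumptions made for illustration.

```python
from collections import Counter


def route_faults(faults):
    """Split reported faults by the cause found in Fault Isolation.

    `faults` is an iterable of (fault_id, cause) pairs, where cause is
    "hardware" or "software". Returns the identifiers handed to the
    Maintenance activity and a count of faults by cause.
    """
    faults = list(faults)  # allow generators to be traversed twice
    maintenance_queue = [fid for fid, cause in faults if cause == "software"]
    counts = Counter(cause for _, cause in faults)
    return maintenance_queue, counts
```

Only the work on the software-caused faults is chargeable wholly to software; the screening effort itself is shared with hardware and is not apportioned here.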

2.5.2  Modification 

Modification of software is an activity that also occurs during
operation and maintenance. Experience with the operation of a system
will identify deficiencies in the original requirements and functions
of the system. In addition, enhancements will be required because of
changes in the mission of the system. These changes in requirements
will potentially affect the software components of the system.

Deficiencies in the design of the software may be identified by
the Fault Isolation activity. Such deficiencies are those that
require changes in the specifications of the software, rather than
corrections to make it accurately reflect the existing specifications.
The design changes will normally be accumulated and merged with changes
that reflect desired extensions of the software's functions.

A new model or version of the software may be commissioned as
part of the Operations and Support phase. Developing the new model
begins with an SCN directing changes to the software. Nominally,
these changes will be incorporated through a series of activities
identical to those of the Development phase. A major difference is
that the new model of the software is derived by modifying existing
bodies of code and specifications rather than developing the software
from scratch.* Another difference between the modification and
development activities is that modification incorporates some coding
changes to repair outstanding problems with the software which were
not deemed critical and therefore not formally incorporated during
error correction.

* An example of the important distinction between these two activities
is the Defense Satellite Program software. The multiple versions of
the CONUS Ground Station software are modifications, while the
Simplified Processing System (SPS) software constitutes a new
development.

EXPLANATORY  FACTORS  FOR  COSTS  OF  INDIVIDUAL  PHASES


This section presents the hypotheses we initially developed about
explanatory factors for the costs of software Development, Installation,
and Operation and Support. A major choice at the outset was to develop
hypotheses for several generic types of software that are of interest
to ESD, and to ignore the fine structure of the software. That is,
we did not consider relationships that depended on, say, the number
of input/output statements or the hierarchy of subroutine calls. As will
be seen, a major explanatory variable we did choose is the overall size
of the software (number of object-code statements, in general). Given
the generic type of software, and the hardware on which it is to be used,
this instruction count is estimated early in project development. It is
a readily available descriptor of previously developed software, although
it generally does not include non-deliverable code,* such as support
tools.

This section discusses explanatory hypotheses only for man-hours.
The other important resources (computer-hours and elapsed time) were not
considered because of the shift of attention to the overriding issue of
the relations among activities.

The framework for this discussion is the Process Model described
in Sec. 2. The phases are assumed to be defined by milestones (PDR, CDR,
etc.) rather than by activity divisions (e.g., design, coding, and test),
since this definition makes it easier to collect data from previous
software development. (If the reporting system described in Sec. 8 is
adopted, future developments will record cost data by cost elements that
correspond more naturally and accurately to the activities.)


* Non-deliverable code should be a part of the count, as resources are
required for its development. However, it most surely is not included
in historical data and is implicitly included in estimating relationships
only because it causes an apparently smaller number of lines of code per
man-month.


As we have explained earlier, our attempts to quantify the following
hypotheses were not successful because of the significance of trade-offs
among phases. However, since the definition of hypotheses for
the cost-determining factors of the separate phases in software
development has not been attempted before, we consider it important to
publish these hypotheses so that they can be subjected to critical review
by the software community.

Our attempts to demonstrate the significance of some of these
relations are reported in Sec. 6. In particular, the maintenance
relationships are evaluated, and the effect of the choice of programming
language is explored.

3.1  DEVELOPMENT  PHASE 

Recall that the Development Phase is divided into:* (1) Analysis,
(2) Design, (3) Coding and Checkout, (4) Internal Test and Integration,
and (5) Qualification Testing.

3.1.1  Analysis  Phase

Analysis (milestone) is defined as the work, completed no later than
PDR, that generates baselined Part I Specifications for each CPCI. It is
assumed that the staffing of this phase is homogeneous,** so that the
resources can be estimated in man-hours and then converted to dollars by
a single cost-per-man-hour number. The factors that are selected to
explain the labor resource are chosen because they are readily measured
(for past projects) and estimated (for future projects); and because they
are end products of the Analysis Phase and its successor phases.

* Milestone definitions.

** An inhomogeneous labor mix, as shown by Wolverton, is significant in
determining the overall cost of development. For individual software
phases during Development, we believe that the inhomogeneity is
insignificant in determining the cost per man-hour since software
phases correspond closely to job skill. We therefore assert an average
cost per man-hour for each phase (not, however, the same for all phases).
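
The conversion from man-hours to dollars using one average rate per phase is simple arithmetic, sketched below in Python. The function name `phase_costs` is hypothetical, and any rates supplied to it are the caller's own figures; none are taken from this report.

```python
def phase_costs(man_hours, rate_per_hour):
    """Convert per-phase man-hours to dollars with one rate per phase.

    Both arguments are dicts keyed by phase name. Returns the cost of
    each phase and the total.
    """
    costs = {
        phase: hours * rate_per_hour[phase]
        for phase, hours in man_hours.items()
    }
    return costs, sum(costs.values())
```

Keeping the rate separate for each phase reflects the assertion that the rate is uniform within a phase but not across phases.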
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The  first  hypothesis  Is  that  the  labor  required  for  Analysis  stra- 
tifies as  a function  of  the  type  of  software  being  developed.  This 
hypothesis  was  selected  rather  than  an  estimation  technique  based  on 

program  structure.  Five  types  of  software  of  Interest  to  ESD  are  con- 
* 

sldered:  (1)  operational  flight  programs;  (2)  tactical  mission 

control  programs;  (3)  command,  control,  communications,  and  Intelligence 

(C  I)  programs;  (4)  slmulator/tralner  programs;  and  (5)  automatic  test 

equipment  software.  Each  of  these  five  types  of  software  represents  a 

problem  with  unique  development  and  operational  characteristics.  For 

example,  operational  flight  programs  are  Implemented  on  small  embedded 

computer  systems  and  are  generally  concerned  with  functions  related  to 

3 

guidance  and  weapon  systems.  C I software  Is  Implemented  on  large  ground- 
based  computers.  Tactical  mission  control  programs,  like  operational 
flight  programs,  are  Implemented  on  small  embedded  computers,  but  are 
distinguished  by  their  functions  and  the  more  frequent  need  for  maintenance 
as  missions  change.  It  Is  hoped  that  this  simple  stratification  will 
encompass  and  quantify  many  of  the  effects  that  are  generally  attributed 
to  "problem  complexity"  and  "function".  Stratification  will  be  represented 
by  developing  an  estimation  technique  with  a common  mathematical  form, 
but  with  different  constants  for  each  type  of  software. 

The  second  hypothesis  Is  that  analysis  labor  Is  proportional  to 
some  power  of  the  "size"  of  the  problem  being  analyzed.  Denoting  the 
size  by  X,  : 


This  classification  of  ESD  software  projects  was  suggested  by 
Mr.  Dan  Fitzgerald  at  Wrlght-Patterson  AFB. 

k 

The  value  of  Xi  Is  assumed  to  be  summed  over  all  CPCIs.  It  Is  the 
total  magnitude  of  the  system  that  Is  assumed  to  contribute  to  the 
complexities  of  subdividing  It  Into  CPCIs  and  correctly  defining 
their  Interfaces,  Interactions,  and  specifications. 
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where  the  constants  a.  and  a„  are  separately  determined  for  each  of  the 
five  types  of  software. 

The size of the problem, X_1, is surely measurable in some sense by
the size of the software that is to be developed; and the size of the
software is most straightforwardly measured by the number of instructions,
either in the source language or in the machine language (object
language). Studies of total software-development cost have suggested that
using the number of source-language statements as a measure produces
estimates with smaller variance. (All total-cost estimators, however,
have very large variances.) When estimating the costs of the software
phases separately, it is possible to use source-language statements as
a measure for some activities and object-language statements for others,
as we do here.

For the Analysis phase, what is wanted is a consistent measure of
the size of the problem to be solved, without reference to the programming
(source) language that will be used to solve it. For this reason, we adopt
object-language statements as the measure of size for this activity.
Because different computers require different numbers of object-language
statements to express the same operations, the number of object-language
statements should be adjusted for the type of machine to be used (either
by stratifying the data or by defining a multiplier that depends on
machine type). Thus, X_1 is defined as the number of object-language
statements in the delivered software, adjusted for machine type.
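The multiplier option mentioned above can be sketched as follows. This is an illustration only: the machine names and multiplier values are hypothetical, and the summation over CPCIs follows the stated assumption that the Analysis size is the total for the system.

```python
# Hypothetical multipliers that express object-statement counts from different
# machines on a common baseline; the machine names and factors are invented.
HYPOTHETICAL_MACHINE_MULTIPLIER = {"machine_a": 1.0, "machine_b": 0.8}


def adjusted_statements(raw_object_statements, machine_type):
    """Scale a raw object-statement count by the machine-type multiplier."""
    return raw_object_statements * HYPOTHETICAL_MACHINE_MULTIPLIER[machine_type]


def analysis_size_x1(cpcis):
    """Sum adjusted counts over all CPCIs to obtain the Analysis size X1.

    `cpcis` is a list of (raw_object_statements, machine_type) pairs.
    """
    return sum(adjusted_statements(count, machine) for count, machine in cpcis)


x1 = analysis_size_x1([(10_000, "machine_a"), (5_000, "machine_b")])
```

Stratifying the data by machine type, the other option named in the text, would instead fit separate constants per machine and leave the counts unscaled.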


** The power curve has been selected to accommodate apparent diseconomies
of scale in the software development process. This nonlinear behavior
was indicated in early SDC studies.

† If all programs were coded in the same source language, say JOVIAL,
then source-language statements would be a better measure, since that
would automatically compensate for different machine types.


The final hypothesis is that the relationship of analysis labor
to product should depend on whether the Analysis phase is adapting an
existing software specification to a new application or working from an
entirely new specification. Thus, separate estimators of the same form
should be developed for each of these two categories of projects; that
is, the constants a_1 and a_2 also depend on this factor.

An alternative measure of the product might be the size of the
Part I Specifications. As an independent variable, it could be substituted
for the variable X_1 in the above equation (with corresponding
adjustments to a_1 and a_2). One difficulty with using this measure is
that a Program Office has some control over the format and content of
Part I Specifications. This would invalidate the general use of the size
of Part I Specifications in the comparison of projects. It is of some
interest, however, to test the hypothesis that the size of the final
product (measured by an instruction count) is related to the size of the
Part I Specifications, and therefore interchangeable as the independent
variable. This hypothesis could be tested by collecting data on a single
project with multiple CPCIs, all subject to the same specification formats.
The demonstrated existence of a relationship would be the basis for
further investigation of alternative measures of product magnitude.

3.1.2  Design  Phase 

The Design phase is defined as the work, beginning at PDR and ending
with CDR, that generates initial Part II Specifications. It is again
assumed that staffing can be viewed as homogeneous, but perhaps with a
different cost per man-hour than for the Analysis phase. The motivation
for choosing the explanatory factors is the same as before.

As with Analysis, the first hypothesis is that the labor requirements
stratify as a function of the type of software being developed.
Stratification is again represented by developing an estimation technique
with a common mathematical form, but with different constants for each
type of software. The same types of software listed before are assumed.
The second hypothesis is, again, that labor man-hours are proportional
to some power of the size of the software. In the Design phase,
as well as in Analysis, the programming language to be used should not
affect the estimate; hence, we again adopt object-language statements as
the measure of size. However, the Analysis phase often identifies some
portions of the problem as suitable for implementation by existing
software. These portions need not be considered in the Design phase. Hence,
we define a size measure, X_2,* as the number of object-language statements
in the delivered software, adjusted as before for machine type, minus
the number of statements copied from existing software. The relationship
for the Design phase is then:

Man-Hours, Design = b_1 X_2^{b_2}

where b_1 and b_2 are constants separately determined for each of the types
of software.
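The Design size measure and the summation over CPCIs can be illustrated with a short Python sketch. The constants b1 and b2 are supplied by the caller and the sample figures are invented; nothing here is fitted to data from this study.

```python
def design_size_x2(adjusted_object_statements, copied_statements):
    """X2 for one CPCI: adjusted delivered statements minus copied statements."""
    x2 = adjusted_object_statements - copied_statements
    if x2 < 0:
        raise ValueError("copied statements cannot exceed delivered statements")
    return x2


def design_man_hours(cpcis, b1, b2):
    """Sum b1 * X2**b2 over CPCIs.

    `cpcis` is a list of (adjusted_object_statements, copied_statements) pairs;
    b1 and b2 are hypothetical constants for one software type.
    """
    return sum(b1 * design_size_x2(total, copied) ** b2 for total, copied in cpcis)


# Two CPCIs; the second reuses 2,000 statements from existing software.
hours = design_man_hours([(8_000, 0), (6_000, 2_000)], b1=1.0, b2=1.0)
```

Because the estimate is formed per CPCI and then summed, an exponent above 1 gives a smaller total than applying the same power law once to the combined size, which is the treatment assumed for Analysis.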

The third hypothesis is that the relationship of design labor to
product depends on whether the system to be developed is an adaptation of
an existing system or entirely new. Separate estimators of the same form
should be developed for these two categories of projects; i.e., the
constants b_1 and b_2 depend on this factor.

It is again of secondary interest to test for a relationship between
the final size of the product of the software development (code) and that
of the immediate product of the Design phase (a Part II Specification).
It is hypothesized that the size of a Part II Specification varies with the
form of specifications selected by the Program Office and with the
functions of the software being specified. Using data from a project with
multiple CPCIs, all with the same required form of specifications, it is
possible to remove the effect of the first variable. A regression analysis
can then be used to test whether a relationship between the size of the
Part II Specification and the original estimate of program size must
also include a dependence on the functions implemented by the CPCI.
It would also be of interest to test for a relationship between the sizes
of a Part I and a Part II Specification in a similar manner.

* In the Design equation X_2 is assumed to represent the individual
instruction count for a CPCI. The total costs of Design are, of course,
summed over all CPCIs.

The  existence  of  such  relationships,  perhaps  dependent  on  the  form 
of  the  specifications,  would  be  useful  in  monitoring  the  status  of 
projects.  Original  estimates  of  program  sizes  could  be  compared  with 
the  size  of  the  Specifications  to  signal  items  that  may  require  re- 
estimation  and  management  review. 

3.1.3  Coding  and  Checkout  Phase 

The Coding and Checkout phase is defined as the work that begins
after CDR and ends with the start of Internal Test and Integration.* It
consists of translating Part II Specifications to source code that can
be translated and interpreted by the computer system, and unit-testing
that source code.** It is again assumed for this activity that staffing
is homogeneous so that the costs of labor can be averaged, perhaps with
a different cost per man-hour than for the Analysis or Design activity.
Again, the motivation for choosing the explanatory factors is the same as
in the previous hypotheses.

* It is recognized that no single point in time corresponds to the
completion of this phase. Conventional "bottom-up" developments will
generally code and unit-test all of the modules of a system, deferring
any attempts to test the modules as an integrated unit. When all
modules have been coded and tested, the system enters an internal
integration and testing phase prior to qualification testing. Mills has
pointed out that it is usually during this period of activities that
difficulties arise. Structured programming and "top-down" developments
try to avoid deferring integration testing until all modules have been
coded and tested as individual units, by encouraging the integration
and testing of the software, as a system, as soon as each new module
is developed. This philosophy should manifest itself as a sequence of
points at which modules transition from Coding and Checkout to Internal
Test and Integration.

The first hypothesis is that Coding and Checkout labor is proportional
to some power of the size of the actual software coded and checked
out. In this case the size, X_3,*** is measured by the delivered
source-language statements, excluding any code that is taken directly from
another system. The form of the estimator is

Man-Hours, Coding and Checkout = c_1 X_3^{c_2}

The second hypothesis is that the relationship of labor to product
should depend on the choice of programming language.† A variable X_4 is
introduced whose value is zero if programming is done in assembly language
and 1 if in a higher-order language. The form of the equation becomes

Man-Hours, Coding and Checkout = c_1 c_3^{X_4} X_3^{c_2}

If a system contains software in more than one language, estimates for
the separate parts must be summed to estimate the total.

** Unit testing consists of those tests that a programmer might apply to
a single unit of code to convince himself (or herself) that the unit
of code functions as specified. In particular, the programmer is not
attempting to test the interfaces of the module with other modules of
the system. This is contrasted to the testing described in Secs. 3.1.4
and 3.1.5, in which the programmer implicitly (or explicitly) has
declared a unit of code to be functioning correctly and is testing the
operation of that and other units of code when integrated to form the
CPCI. This level of testing might include exhaustive logic and data
range tests at the module level.

*** X_3 is assumed to be measured with respect to a single unit of code.
The total cost of coding and checkout activities is developed by summing
costs over all of the units of code to be developed.

† This effect has been documented by a number of published results
(Ref. 20).


The final hypothesis is that the relationship of labor to product
should also depend on whether the software system is severely limited
for hardware resources. Coding takes considerably more effort if the
limitations of the machine must be taken into account. This effect was
first discussed in Ref. 21 and further quantified in Ref. 22 as a
discrete variable. In Ref. 22, a system is considered to be so limited if
more than 95% of the available memory is used. The same definition is
proposed here. A variable X_5 is introduced whose value is zero if the
hardware constraint is not present and 1 if it is:

Man-Hours, Coding and Checkout = c_1 c_3^{X_4} c_4^{X_5} X_3^{c_2}

Note that, for this phase, the estimator does not depend upon the
type of software, since Analysis and Design activities have already been
accomplished for the most part.
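A brief Python sketch may help show how the two dummy variables act as multipliers on the basic power law. It assumes the multiplicative form c1 * c3**X4 * c4**X5 * X3**c2, in which each dummy variable appears as an exponent on its own coefficient; the coefficient names and all numerical values are assumptions made for illustration.

```python
MEMORY_LIMIT_FRACTION = 0.95  # threshold quoted in the text from Ref. 22


def coding_man_hours(x3, higher_order_language, memory_used_fraction,
                     c1, c2, c3, c4):
    """Return c1 * c3**X4 * c4**X5 * x3**c2 for one unit of code.

    X4 is 1 for a higher-order language and 0 for assembly language.
    X5 is 1 when more than 95% of available memory is used, else 0.
    All c constants are hypothetical inputs supplied by the caller.
    """
    x4 = 1 if higher_order_language else 0
    x5 = 1 if memory_used_fraction > MEMORY_LIMIT_FRACTION else 0
    return c1 * (c3 ** x4) * (c4 ** x5) * x3 ** c2


def total_coding_man_hours(units, c1, c2, c3, c4):
    """Sum the per-unit estimates; units may differ in language and constraint.

    `units` is a list of (x3, higher_order_language, memory_used_fraction).
    """
    return sum(
        coding_man_hours(x3, hol, mem, c1, c2, c3, c4) for x3, hol, mem in units
    )
```

With a dummy variable equal to zero its coefficient drops out (any coefficient raised to the power zero is 1), so the same expression covers all four combinations of language and hardware constraint.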


Testing the hypotheses about the Coding and Checkout phase requires
that the boundaries of the activity be well defined. Currently, Air
Force regulations do not identify a milestone that can be used as the
termination boundary of this activity. Thus, the boundary between the
Coding and Checkout phase and the Internal Test and Integration phase
is ambiguous. We assume that data can be collected to describe
projects that used the same definition of the boundary, and that none of
those projects included major revisions in requirements or design. If so,
the validity of the hypotheses can be tested by regression analyses. More
careful data collection (as proposed in Sec. 8) will ultimately avoid
this problem, but for now this is the best that can be done.
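One common way to carry out such a regression for a power-law estimator is to fit a straight line to the logarithms of size and labor. The sketch below is an assumption about method, not a description of the analysis actually performed in this study, and the sample data are synthetic.

```python
import math


def fit_power_law(sizes, man_hours):
    """Least-squares fit of log(y) = log(k1) + k2 * log(x).

    Returns (k1, k2) such that y is approximately k1 * x**k2.
    """
    if len(sizes) != len(man_hours) or len(sizes) < 2:
        raise ValueError("need at least two matched observations")
    xs = [math.log(s) for s in sizes]
    ys = [math.log(m) for m in man_hours]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all sizes are identical; exponent is undetermined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k2 = sxy / sxx
    k1 = math.exp(mean_y - k2 * mean_x)
    return k1, k2


# Synthetic observations generated from k1 = 3, k2 = 1.2 (invented values).
sample_sizes = [1_000, 5_000, 20_000]
sample_hours = [3.0 * s ** 1.2 for s in sample_sizes]
fitted_k1, fitted_k2 = fit_power_law(sample_sizes, sample_hours)
```

Projects would be grouped by stratum (for example, by language) before fitting, so that each group yields its own pair of constants.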


3.1.4  Internal  Test  and  Integration  Phase 

The Internal Test and Integration phase begins with the completion
of Coding and Checkout (for individual modules) and terminates with
the start of Qualification Testing. It is often difficult to ascribe
the cost of this phase entirely to software; in avionics systems, for
example, Internal Test and Integration commonly addresses the functions
of both hardware and software concurrently. Thus, one should view the
proposed estimation technique as predicting the cost of only the
software portion of the Internal Test and Integration activity.

The software testing activities might include several items: the
integration of modules to form CPCIs, and the informal rehearsal of
qualification tests that are to be formally applied to the software.
These tests, when formally applied, would be considered part of the
Qualification Testing phase discussed in the next section.

In this phase, we believe that it is not adequate to assume that
the labor mix is homogeneous. The reason is the "hidden" redesign and
recoding, discussed in Sec. 2, that often make up a large part of this
phase. We have not proposed a hypothesis to account for this effect; it
is intimately related to the problem of relationships among phases to
be discussed in Sec. 4. If it were to be treated, an explanatory variable
might be the duration of this phase. The longer the time spent in
Internal Test and Integration, the more likely it is that higher-paid
analysts and designers must be averaged into the labor mix.

The first hypothesis is that Internal Test and Integration labor
requirements also stratify as a function of the type of software being
developed. The stratification will again be represented by developing
an estimation technique with a common mathematical form, but with
different constants for each type of software. The same software types used
for the Analysis phase will be assumed.
The second hypothesis is that the labor requirements vary as some
power of the size of the component that must be subjected to testing.
Each CPCI separately must pass a number of test procedures to qualify
it for acceptance. The size is measured by the number of delivered
object-language statements, X_1. The measurement is for a single CPCI.
The form of the estimator is:

Man-Hours, Internal Test and Integration (each CPCI) = d_1 X_1^{d_2}

The third hypothesis is that the relationship between labor and
product also depends on the programming language and on the constraints
that may be imposed by limited computer resources (as discussed under
Coding and Checkout). Each of these factors can modify the effort
required to identify and repair problems encountered during testing.

Man-Hours, Internal Test and Integration (each CPCI)
    = d_1 d_3^{X_4} d_4^{X_5} X_1^{d_2}

where X_4 and X_5 are the dummy variables defined under "Coding and
Checkout."

The fourth hypothesis is that "top-down" development techniques
should have an impact on the costs of Internal Test and Integration.
This is modeled by introducing a new dummy variable, X_6, to capture
the impact of structured programming and "top-down" development on
software development costs. The variable X_6 takes the value 1 if
"top-down" methods are employed and 0 otherwise. The form of the
equation becomes

Man-Hours, Internal Test and Integration (each CPCI)
    = d_1 d_3^{X_4} d_4^{X_5} d_5^{X_6} X_1^{d_2}

Since this estimator applies to the testing efforts for a single CPCI,
the total labor must be summed over all the CPCIs in the product.
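The per-CPCI estimate and the summation over CPCIs can be sketched in Python as follows. The sketch assumes that the three dummy variables enter as exponents on separate coefficients, in the same way as for Coding and Checkout; the class name, field names, and coefficient names d1 through d5 are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cpci:
    """Descriptors of one CPCI used by the sketch (names are illustrative)."""
    object_statements: float      # delivered object-language statements
    higher_order_language: bool   # dummy variable X4
    hardware_constrained: bool    # dummy variable X5
    top_down: bool                # dummy variable X6


def internal_test_man_hours(cpci, d1, d2, d3, d4, d5):
    """Return d1 * d3**X4 * d4**X5 * d5**X6 * size**d2 for one CPCI."""
    x4 = int(cpci.higher_order_language)
    x5 = int(cpci.hardware_constrained)
    x6 = int(cpci.top_down)
    return d1 * (d3 ** x4) * (d4 ** x5) * (d5 ** x6) * cpci.object_statements ** d2


def total_internal_test_man_hours(cpcis, d1, d2, d3, d4, d5):
    """Sum the single-CPCI estimates over every CPCI in the product."""
    return sum(internal_test_man_hours(c, d1, d2, d3, d4, d5) for c in cpcis)
```

A fitted value of d5 below 1 would indicate that "top-down" methods reduce Internal Test and Integration labor; a value above 1 would indicate the opposite.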

3.1.5  Qualification  Testing  Phase 

The Qualification Testing activity begins with the completion of
Internal Test and Integration and terminates with the acceptance of the
CPCI at PCA. As with Internal Test and Integration, it is difficult to
ascribe the costs of acceptance testing to software as distinct from
hardware. Again, one should view the following estimation technique as
applying only to the software component of Qualification Testing.

It will be assumed that the staffing of this activity is homogeneous;
the average cost per man-hour may be different from those of the preceding
phases. Unlike Internal Test and Integration, redesign and recoding should
be minimal. If not, they are certainly visible to Air Force inspectors.
The motivation for choosing explanatory factors is the same as that used
in formulating the previous hypotheses for analysis and design.

Qualification Testing is generally much less expensive than Internal
Test and Integration. Its component costs can be attributed to the testing
of individual Computer Program Components and to the testing of the CPCI
as a whole. Further, it is conjectured that the testing effort required
is related to the size of the delivered product, measured in object-language
instructions (adjusted as before for differences in machine type), and
to the number of CPCIs whose interaction needs to be tested. This phase,
like most of the others, should be stratified by the type of software being
developed. The general form of the estimation technique is

Man-Hours,  Qualification  Testing 


where 


number  of  CPCIs  needed  In  testing  (including  CPCI  being  tested). 


3.1.6 Changes in Requirements and Design

Few software developments are free of changes in requirements and
design, whether done at no (visible) cost or by renegotiation of the
contract. Fig. 2.3 illustrates these with solid lines. It is conjectured
that the change activity is significant in explaining the observed
variations in software cost.

One approach to testing the conjecture would be to treat design
changes as another activity to be costed separately. Any data used in
testing the previously developed hypotheses would have to be adjusted
for any negotiated changes in cost, and be based on the best estimate
of the size the software would have been without the changes. The
approach would require developing a labor estimation technique for
changes in requirements and design. Such a technique would have to
reflect the state of development at the time the change is introduced, and
be responsive to the magnitude of the change.

Because changes may reduce, extend, or modify the scope of a
development, a cost estimation technique would be difficult to develop.
Changes that reduce scope eliminate costs that would be incurred by some
fraction of the project if development were to continue without the
change. Extensions of the scope introduce new costs that would not
have otherwise occurred, and new costs due to interfacing. Modifications
of scope produce some combination of both effects.

In many instances, changes in requirements and design are performed
"at no cost." It is believed that such changes do, in fact, have an
impact and can be used to explain variations in observed costs.

We have not attempted to develop cost hypotheses for changes regarded
as another development activity. In Sec. 6, we do examine a hypothesis
for aggregated development cost, with changes represented by a parameter:
Man-Months, Development =

where   is a size measure (number of statements) and Z^ is the number
of changes (ECPs).

3.2  INSTALLATION  (PRODUCTION)  PHASE 

Recall that the production phase of a software system consists of
its installation on one or more computers. It is assumed that the
staffing of this phase of activities is homogeneous; again, the cost of a
unit of labor may be different from that for other activities. The same
motivations apply for the choice of independent parameters.


The first hypothesis is that the effort of installation depends
linearly on the number of computers that are to receive the software, X_7.
The form of the cost estimating relationship is:

Man-Months, Installation = f_0 + f_1 X_7


The second hypothesis is that the proportional relationship of
cost to the number of computers is modified if software adaptations to
different computers are required. Let X_8 take the value 0 or 1,
depending on whether any computer requires specific modifications to be
made in the software. The form of the cost estimating relationship is
modified to

Man-Hours, Installation = f_0 + f_2^{X_8} f_1 X_7
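The installation relationship can be illustrated with a short Python sketch. It assumes a fixed term plus a per-computer term whose slope is multiplied by an adaptation factor when site-specific modifications are needed; the coefficient names f0, f1, and f2 and the sample values are assumptions for illustration, and the function is deliberately neutral about the labor unit.

```python
def installation_labor(num_computers, needs_adaptation, f0, f1, f2):
    """Return f0 + f2**X8 * f1 * X7.

    X7 is the number of computers receiving the software.
    X8 is 1 if any computer requires specific software modifications, else 0.
    f0, f1, f2 are hypothetical coefficients for one software type.
    """
    if num_computers < 0:
        raise ValueError("number of computers cannot be negative")
    x8 = 1 if needs_adaptation else 0
    return f0 + (f2 ** x8) * f1 * num_computers


# Ten computers, with and without site-specific adaptation (invented values).
without_adaptation = installation_labor(10, False, f0=5.0, f1=2.0, f2=1.5)
with_adaptation = installation_labor(10, True, f0=5.0, f1=2.0, f2=1.5)
```

In this form the adaptation factor scales only the per-computer term, so the fixed term is unaffected by whether adaptations are required.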


The final hypothesis regarding installation cost is that the
equation should be stratified by the type of software being installed; that
is, the f's depend on software type.*

* Clearly, installing C³I software with site-dependent modifications is
different from installing avionics software in a wing of aircraft.
3.3 OPERATIONS AND SUPPORT PHASE


Recall that the Operations and Support phase of software can be
viewed as two separate activities: error correction and modification.
These are the only two activities that can be completely charged to
the software life cycle. The steps of modification are nominally
identical to those of development. It is hypothesized that the form of the
cost estimating relationship for modification should be the same as
for development. However, one should not expect the coefficients to be
identical. The number of versions (modifications) required is hypothesized
to be proportional to the operational life of the system, where the
constant of proportionality depends on type of software.

For error correction, it is assumed that the staffing is homogeneous;
the cost per man-hour may differ from that of other phases. The
motivations for choosing independent parameters are the same.

The first hypothesis is that the cost of error correction is
linearly dependent on the number of reported errors per unit time,
X_9(t). The form of the cost estimating equation is

Man-Hours, Error Correction (per unit time) = g_1 + g_2 X_9(t)

The expected number of errors to be corrected per unit time is
clearly dependent on the system lifetime. It is hypothesized that the
distribution of errors over the system lifetime is of the form specified
in Fig. 3.1.
Figure 3.1. Number of Reported Errors Per Month


The second hypothesis is that the constant of proportionality
in the above equation is dependent upon the size of the software package
being maintained.* The form of the cost estimating equation is modified
to:

Man-Hours, Error Correction (per unit time) = g_1 + g_3 X_1^{g_4} X_9(t)

where g_2 of the previous equation is given by

g_2 = g_3 X_1^{g_4}

and X_1 is the size of the software measured in object-language
instructions, as before.

* The larger the software system, the more difficult it is to isolate
the cause of the software problem.
The third hypothesis is that software maintenance costs are
determined by the method of software development. In particular, the
constant of proportionality is further modified by the use of a
high-order programming language and by the presence of a hardware constraint.
The cost estimating relationship is further modified to

Man-Hours, Error Correction (per unit time)
    = g_1 + g_5^{X_4} g_6^{X_5} g_3 X_1^{g_4} X_9(t)

where X_4 and X_5 are the dummy variables defined earlier to account for
programming language and hardware constraint.
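The error-correction relationship can be sketched in Python as a fixed term plus a slope that depends on software size, language, and hardware constraint, applied to the number of reported errors in each period. The monthly error counts and all coefficient values below are invented for illustration; the shape of the error distribution in Fig. 3.1 is not reproduced.

```python
def error_correction_man_hours(reported_errors, x1, higher_order_language,
                               hardware_constrained, g1, g3, g4, g5, g6):
    """Return labor per period: g1 + g5**X4 * g6**X5 * g3 * x1**g4 * X9(t).

    `reported_errors` is a sequence of reported-error counts X9(t), one per
    period; x1 is the size in object-language instructions. The g values are
    hypothetical coefficients for one software type.
    """
    x4 = 1 if higher_order_language else 0
    x5 = 1 if hardware_constrained else 0
    slope = (g5 ** x4) * (g6 ** x5) * g3 * x1 ** g4
    return [g1 + slope * errors for errors in reported_errors]


# Three periods with a declining error rate (invented numbers).
monthly = error_correction_man_hours(
    [10, 4, 0], x1=10_000, higher_order_language=True,
    hardware_constrained=False, g1=5.0, g3=0.01, g4=1.0, g5=0.5, g6=2.0,
)
```

A period with no reported errors still carries the fixed term g1, which represents the standing cost of keeping a correction capability in place.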

The  final  hypothesis  is  that  software  maintenance  costs  are  deter- 
mined, in  part,  by  the  type  of  software  being  maintained.  Thus,  the 
coefficients  of  the  above  equation  must  be  derived  for  each  type  of 
software  considered. 
4 HYPOTHESIZED RELATIONSHIPS BETWEEN ACTIVITIES

The construction of estimating relationships by considering each
activity independently implies that the relationships for one activity
are not affected by parameters associated with a different activity. For
example, we assumed in Sec. 3 that the man-months required for Coding
and Checkout (milestone) could be predicted by software and computer
resource parameters, and that they would not be significantly affected by
the man-months spent in Design (milestone) or in Testing (milestone).

This section considers the construction of estimating relationships
on the hypothesis that the separate phases or activities are tied together
in predictable ways, and that it is not possible to develop estimating
relationships without considering these interactions. (In fact, we can
painfully attest to the inability to make sense out of man-hour data
for individual phases or activities without considering these trade-offs.)

This approach holds that, besides the relationships between the
cost-driving parameters and the software-related characteristics, there
are trade-offs between resources expended in one phase or activity and
resources expended in another. For example, increasing the resources
spent in design would tend to decrease the resources required for testing
or maintenance. The objective of this section is to develop this type of
relationship.

4.1  A SOFTWARE  LIFE-CYCLE  COST  MODEL  WITH  RELATIONSHIPS  BETWEEN 
ACTIVITIES  OR  PHASES 

The justification for using a model with interrelated phases or
activities for analyzing software cost derives from two propositions:

• Analyses using the simpler independent-phases approach have
not succeeded. Man-hour variations within activities are
large. This led us to try a more complex model.

• Analysis of detailed data from software development projects
provides substantial evidence of why and how the phases or
activities affect one another.

In this section, we examine some data on the activities in an actual data
processing development. (The data were sufficient to use the "activity
definitions" for Analysis, Design, etc.) We shall show that, in addition
to the basic problem of translating a set of requirements into reliable,
correct computer programs, there are influences caused by the need to
conform to a development plan. The plan is an essential management tool
for ensuring that needed resources are available to the project at the
proper time and in the correct amounts. One would like to see how changes
in the plan, caused either by changes in requirements or by failure to
meet commitments, affect cost-driving parameters. Particularly, one
would like to see how management actions influence the measurable
project descriptors.* Understanding this relationship should permit a more
accurate analysis of the cost-driving variables.

Figure 4.1 shows the time spans and levels of effort for the
different phases of a software development project. (Data are derived
from the PARMIS data base described in Sec. 5.) Planned values are shown
by solid lines and actual values by dashed lines. The example is a
business program written in COBOL.

To begin with, there is considerable scheduled overlapping of the
design and coding activities. A milestone definition of these activities,
such as that used in Sec. 3, misses these overlaps. This overlapping is
a common practice, but it increases the likelihood that changes in the
design will require parts of the system to be recoded. Such a schedule
might have been adopted because time was short or because certain people
were only available at certain times. In either case, overlapping
causes any problems or delays to have increased impact on the work.

* Project descriptors include man-hours for analysis, coding, testing, etc.
(planned and actual), time span for the activities, numbers of personnel,
application classification, etc.

Figure 4.1. Scheduled and Actual Activities in a Software Development:
Example 1


The Analysis activity of the project was carried out at about the
planned level of effort.* The long delay in its completion was accompanied
by some delay in the start of the Design activity. This latter delay was
probably beneficial since, as will be seen in the next figure, failure
to delay the start of an overlapped activity can increase scheduling
problems. However, completion of Analysis lagged until three months after
its scheduled date, and in the meanwhile both the Design and Coding
activities were started.

The delay in the completion of Analysis indicates that some needed
information was missing, some procedures were not defined, or unexpected
problems occurred. Going ahead with the Design and Coding activities
would almost assure an increase in the time required to code and test the
system. This is a good example of one type of interaction between
activities. The increased coding and testing hours would not be predicted by
any method that relied solely on parameters related to those activities.

The delays, along with the substantial overlapping, were associated
with a significant increase in the resources required to complete the
project:

                          Estimated     Actual        Percent
Activity                  Man-Months    Man-Months    Increase

Analysis                      4.5           6.6           47
Design                        9.1           9.9            9
Coding                        4.5          19.6          336
Test & Integration            4.2          10.6          152
Qualification Testing         2.2           6.0          173

Total                        24.5          52.7          115
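The percent-increase column can be re-derived from the two man-month
columns. The following Python sketch is illustrative only; the function
name and the rounding to a whole percent are assumptions, not part of the
original analysis.

```python
# Illustrative sketch: recompute the "Percent Increase" column from the
# estimated and actual man-months reported for Example 1.
# The helper name and whole-percent rounding are assumptions.

def percent_increase(estimated, actual):
    """Return the increase of actual over estimated, as a whole percent."""
    return round((actual - estimated) / estimated * 100)

# (activity, estimated man-months, actual man-months) from the table above
example_1 = [
    ("Analysis", 4.5, 6.6),
    ("Design", 9.1, 9.9),
    ("Coding", 4.5, 19.6),
    ("Test & Integration", 4.2, 10.6),
    ("Qualification Testing", 2.2, 6.0),
]

total_estimated = sum(est for _, est, _ in example_1)
total_actual = sum(act for _, _, act in example_1)

for name, est, act in example_1:
    print(f"{name:<22} {percent_increase(est, act):>4}%")
print(f"{'Total':<22} {percent_increase(total_estimated, total_actual):>4}%")
```

Run as written, the sketch reproduces the tabulated percentages, which
confirms that the unlabeled last row of the original table is the column
total.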

*We divided total man-hours by time span to determine average staffing.
Therefore, any gaps in the work would reduce the calculated staffing.
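The staffing calculation described in this note can be sketched as
follows. This is a minimal illustration: the conversion of 160 man-hours
per man-month and the sample figures are hypothetical assumptions, not
values taken from the project data.

```python
# Minimal sketch of the average-staffing calculation.
# hours_per_man_month=160.0 is an assumed conversion factor.

def average_staffing(total_man_hours, span_months, hours_per_man_month=160.0):
    """Average number of people: man-months of effort per calendar month."""
    man_months = total_man_hours / hours_per_man_month
    return man_months / span_months

# Same effort, but the second activity's time span includes a one-month gap.
continuous = average_staffing(960, 3)
with_gap = average_staffing(960, 4)
print(continuous, with_gap)
```

The second value is lower than the first even though the effort is
identical, which is the effect the note warns about.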




The figures indicate that, for this project, delays and overlapped
activities were associated with large increases in the consumption of
resources over what was expected. The combination of delays and parallel
activities has a compounding detrimental effect on a project schedule.*
For example, when coding is begun before the completion of design, the
designers are required to communicate their results to the programmers
in a raw, unqualified state (hence significantly increasing the chance
of design errors). Overlapping also raises the possibility that the
designer may not change a poor procedure when he discovers it, because
he has already committed himself to the programmer. Many times the
programmer may fill in missing information by himself. By doing this he
may introduce errors into the system that will not be discovered until
late in the testing program, when repairs will be time-consuming and
expensive.

*This is not to suggest that systems cannot be developed with overlapping
activities. Many systems have distinct parts that can be coded before
the entire design is completed. In a top-down design where coding is by
tiers, the coding can often begin before the design is complete. These
are planned developments that would permit the overlapping of these
functions. We are concerned here with the situation where the press of
the development schedule or the slippage of preceding activities results
in overlapping activities that would have been accomplished better
sequentially. Even in a planned implementation of parallel activities,
however (and this includes top-down design), whenever the coding begins
before the design is completed there is an increased risk of changes to
the design or of mismatches in subsystem interfaces. The project
management must weigh these risks in relation to the need for workload
balancing and project scheduling.

Adding programmers to a project that is behind schedule also
introduces a communication problem and an associated decrease in
productivity. Brooks's Law [20] is, "Adding manpower to a late software
project makes it later." The existing staff must take time to lay
out all the ground rules for the new members and describe all the details
of the system. Since the system is behind schedule, the documentation
is sparse and usually outdated. The result is that often it takes
more time to explain what is to be done than to do it. Moreover, the
situation is ripe for producing mistakes which can easily cause the group
to be less productive than before the additional staff was assigned.
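One common way to illustrate this communication burden is the observation,
associated with Brooks, that if every pair of people on a team must
coordinate, the number of pairwise channels grows as n(n-1)/2. The sketch
below is a simplified illustration under that assumption; it is not drawn
from the project data, and the team sizes are hypothetical.

```python
# Simplified illustration: pairwise communication channels in a team.
# Assumes every pair of team members must coordinate with each other.

def communication_paths(team_size):
    """Number of distinct pairs among team_size people: n(n-1)/2."""
    return team_size * (team_size - 1) // 2

# Hypothetical team sizes, before and after staff are added.
for n in (3, 5, 8, 12):
    print(n, communication_paths(n))
```

Under this assumption, growing a team from 5 to 8 people raises the number
of channels from 10 to 28, so coordination effort can grow much faster
than headcount.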

In the example we are considering, there was an additional reason
for the increase in expended hours. The project was subjected to changes
in the functional requirements during its development. It is difficult to
say what proportion of the increase was caused by these changes, but we
believe that the introduction of some changes is normal during project
development. In this instance, changes made a bad situation much worse.

We are not yet trying to build a case for cause-and-effect relationships
between delays and overlapping and increased consumption of resources.
We are simply using an existing project history to illustrate how
interactions among activities may be seen to influence the expected
resource requirements. On the basis of a single project one could simply
conclude that the project was poorly planned and executed. However,
examining two other projects lends supporting evidence.

Figure 4.2 is for another business-application project. Here again
there is heavy overlapping of activities. In this case, however, delays
in the completion of analysis and design were not accompanied by a slip in
the start of coding. Sometimes a project manager must assign personnel
to his project or face the possibility of losing them. In many instances
the programmers will be scheduled for another project on completion of
the present effort and cannot delay the start of coding without
jeopardizing the other project. If that was the case in this example,
however, they still missed the start of the subsequent project by nearly
three months.

Figure 4.3. Scheduled and Actual Activities in a Software Development:
Example 3

As a final example, Figure 4.3 shows a project that was completed
in a better fashion than the preceding two. The analysis and design
activities overlapped but, significantly, the design was completed before
coding started. Less effort was put into the analysis than had been
planned, but there was an increase in the hours required for the design:*


                          Estimated     Actual        Percent Increase
Activity                  Man-Months    Man-Months    (Decrease)

Analysis                      43            24            (44)
Design                        12            16             33
Coding                        28            37             32
Test and Integration          12             9            (25)
Qualification Testing          3             6            100

Total                         98            92             (6)

The  project  was  completed  on  schedule. 

Notice that the projects described in Figs. 4.2 and 4.3 were
scheduled to have the analysis phase continue until after the completion
of the design activities. This occurs most often when the Part I
Specifications are not fully developed at the start of the project.
Functional requirements are allowed to change during the design phase much
more than a pragmatic approach would dictate. This is the case with many
information-system developments where management participation in defining
functional requirements is not sufficient. As details of the design become
established, the impacts of the specifications become more apparent to
members of management, and their reactions require changes in the
specifications. Many project managers, therefore, do not attempt to
finalize the Part I Specifications. Instead, they schedule the analysis
and design phases concurrently. The period of analysis after the
completion of the design is used to complete the documentation of the
Part I Specifications.

*Many software development activities are difficult to define. The line
between analysis and design becomes blurred in practice. In some instances
both functions are performed by the same individual, who may also do
some or all of the coding. It may be that in this instance some of the
analysis hours were reported as design.

Business-oriented or information-system development projects are
not unique in this practice of overlapping the analysis and design
activities. It also occurs in non-business applications, C3I, and other
types of software development projects. It happens whenever the press of a
schedule does not allow a proper definition of the Part I Specifications
or when there is not sufficient knowledge of the requirements to
formulate good specifications. If this situation exists, and a formal
life-cycle development model is imposed on the project, the specification
changes get reported as part of the coding and subsequent activities.
Analysis of project data from this point of view suggests that the
practice may be quite common. It is probably the most difficult problem
to cope with when trying to use the reports of resource expenditures to
determine the underlying controlling factors.

The preceding discussion has been presented in support of the
contention that the relationships among the software development phases
are extensive and very important to the consumption of resources. As
will be shown later, these relationships extend into the operation phase.

In consideration of the arguments presented above, the model
software development cycle shown in Fig. 4.4 will be used as the basis
for formulating life-cycle man-hour relationships. The model is a
generalization of the principal features identified in the examples taken
from actual projects.

The analysis, design, and coding activities have been overlapped
to represent a project constrained by its PCA completion date. The
coding and integration-and-test activities have been staffed at a higher
level than would have been used without the constraint. This would
normally mean that a natural separation of the project along functional
lines is altered to permit the assignment of additional programmers.

The model shows a delay in completing the design, accompanied by the
addition of more staff to the coding and testing effort. The completion
of the testing phase is shown as slipping. The result of the way in
which the project actually proceeds is an increased error rate in
the developed software (over that planned for when originally
assigning the maintenance staff). The maintenance staffing is
correspondingly increased.

4.2  TRADE-OFFS  DURING  DEVELOPMENT 

We have shown how interactions between project activities occur and
some of the direct influences on resource requirements. In this section
we discuss other consequences that result from deliberate actions by
project managers. We are concerned with decisions to alter planned
allocations of resources caused by departures from the development
program schedule. We also consider the effects of calculated risks taken
by managers in an attempt to bring slipping projects back on schedule.

In order to analyze the development trade-offs, it is useful to
establish some hypothetical reference conditions or benchmarks. For each
software development activity, there is some ideal allocation of resources.
Figure 4.5 shows how these ideal allocations for each activity might
look, taking the representative cost-driving parameter to be lines of
source code.

In the first diagram, the linear function defines the man-months
of analysis and design required for the estimated size of the system
being developed. Because the design and coding activities overlap in
the model development, the man-hours expended, represented by the + sign,
are far below the ideal line. The project manager has accepted the risk
that changes in design will delay the completion of coding, to avoid the
schedule slip which would have resulted if coding had been delayed until
design was complete.


The  second  chart  indicates  that  the  total  expenditure  on  analysis 
and  design  was  close  to  the  ideal.  However,  because  the  completion  of 
design  took  place  later  than  planned  (see  Fig.  4.4),  the  coding  effort 
suffered  a decrease  in  productivity.  Under  these  parallel  development 
circumstances,  the  quality  of  the  design  would  be  expected  to  suffer. 


The third chart shows that, as expected, the coding and testing
activities consumed more resources than would have been indicated. The
next chart indicates that the balance between the coding and testing
activities was consistent with the ideal. The last two charts show that the
scheduled maintenance effort was less than the ideal, while the reported
faults were higher. The higher fault rate occurs because the testing
effort was not sufficient to properly test the changes related to the
delay in completing the design. The result of the mismatch between the
maintenance effort and the error rate is a system that is forced to limp
along with minor deficiencies while the major problems are patched by
an overworked maintenance staff.

*In order to describe the concept of trade-offs and phase relationships
more easily, we have made two simplifying assumptions: first, we will
assume that the single descriptor of the end product is the number of
lines of source code; and second, we will assume that given the lines
of source code, there is some ideal allocation of resources for each
development activity. The discussion then describes how violating these
hypothetical ideals influences the different activities.

Figure 4.5. "Ideal" and Typical Resources Expended in Each
Life-Cycle Activity

The preceding discussion was intended to illustrate how the
assumed primary cost-driving parameters are influenced by decisions made
during the development project. During each activity, management must
make decisions that may increase development costs, cause slips in
schedule, or risk problems later in the software life cycle. We have
looked at one set of decisions that affect the cost-driving parameters
through activity interrelationships. Many other sets of decisions are
possible.

Figure 4.6 illustrates some other departures from the ideal which
may occur, and how they may be reflected in the error rate of the
delivered software, which is indicative of the reliability of the software.
One external cost-driving parameter, changes (ECPs), has been added to
those included in Fig. 4.5. The ">", "<", and "=" indicate whether the
project experience was above, below, or equal to the ideal for the given
activity. Eight columns show eight different sets of relationships.

Figure 4.6. Postulated Trade-offs Among Life-Cycle Man-Hour
Parameters

For  example,  the  second  column  indicates  that,  with  all  other 
activities  corresponding  to  the  ideal  and  with  no  changes,  less  than  ideal 
effort  spent  on  coding  and  checkout  would  be  expected  to  cause  a higher 
error  rate  of  the  delivered  software  than  the  ideal.* 
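The comparison of each activity against its ideal can be sketched as
follows. This is a minimal illustration, assuming a linear ideal of the
form man-months = rate x lines of code; the rates, the 10 percent
tolerance band, and all names are hypothetical placeholders rather than
values from this study.

```python
# Hypothetical ideal rates, in man-months per 1000 source lines.
# These numbers are placeholders chosen only to make the sketch runnable.
IDEAL_RATE = {
    "analysis_design": 1.0,
    "coding": 0.8,
    "testing": 0.6,
}

def ideal_man_months(activity, lines_of_code):
    """Assumed linear ideal: effort proportional to lines of source code."""
    return IDEAL_RATE[activity] * lines_of_code / 1000.0

def compare_to_ideal(activity, lines_of_code, actual_man_months, tolerance=0.10):
    """Return ">", "<", or "=" in the notation used for Fig. 4.6."""
    ideal = ideal_man_months(activity, lines_of_code)
    if actual_man_months > ideal * (1 + tolerance):
        return ">"
    if actual_man_months < ideal * (1 - tolerance):
        return "<"
    return "="

# A hypothetical 20,000-line project with 10 man-months spent on coding.
print(compare_to_ideal("coding", 20000, 10.0))
```

With these placeholder rates the example project falls below the ideal for
coding, the situation the second column of Fig. 4.6 associates with a
higher error rate in the delivered software.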

In conclusion: the development of low-risk, practical life-cycle
cost estimating relationships requires the consideration of the
interactions among the activities or phases. Furthermore, we postulate that
any analysis that does not include these interactions will not succeed
in reducing the scatter that makes existing software cost estimating
schemes unsuitable for effective project planning and control.

Recognition of the interactions poses some practical problems
in data collection and interpretation that must be solved before valid
data will be obtained. These problems are discussed in the next
section.

*One might argue that the "ideal" error rate would be zero; but a
practical solution would be to avoid spending large amounts of resources
to achieve zero errors. Therefore, it would be expected that proper
planning would allow for some small acceptable error rate.

5  EVALUATING RELATIONS BETWEEN ACTIVITIES


Gathering enough data to evaluate the hypotheses proposed in
Sec. 4 poses a problem. We had collected data from the Air Force Data
Systems Design Center (AFDSDC) to test the relationships hypothesized
in Sec. 3. This data base was selected because it contained detailed
information on resource consumption, both planned and actual, for a
number of software developments that used a common reporting system and
the same programming language to produce software of the same generic
type. With such a data base we could control for a number of variables
that affect the man-hours.

When the data was tested, however, variances were large and it
became clear that the model of Sec. 3 was too simplistic. The hypotheses
that were stated in Sec. 4 evolved from studying this data.
Unfortunately, the number of usable data points for testing these
hypotheses is small, and the data base will have to be greatly expanded to
yield statistically significant results. However, the data base is large
enough to show the potential of this approach. This potential includes
the development of rules for the optimal allocation of man-hours, in
addition to the development of estimating relationships.

In this section we will first describe the data base we are using
and its origin. Next, the data will be used directly to test, in a
largely qualitative fashion, some of the relationships stated in Sec. 4.
Finally, we will demonstrate how the data can be used to optimize
resource allocations among activities or phases. Although the data base
is too small to produce much confidence in the specific curves we
show as examples, we believe that the technique and its potential are
demonstrated.

5.1  DATA DESCRIPTION

5.1.1  The Air Force Data Systems Design Center (AFDSDC)

The Data Systems Design Center at Gunter Air Force Station, Alabama,
has approximately 1300 personnel, 800 to 900 of whom are analysts and
programmers. The mission of the Design Center is to develop and maintain
all Air Force standard automated data processing systems; the Center does
not have authority over systems that are unique to a command.

The Center is organized into functional directorates, each headed
by a person experienced in that functional area. The Center supports
over 140 data processing installations worldwide. At the Center there
are three computer systems which can simulate any of the computer
configurations used at these installations to implement the ADP
systems developed at the Center. The primary system is the Burroughs
B3500, which is called the base level computer. Supply systems are
implemented on the Univac 1050. The Honeywell H6000 is used for
applications at major-command installations. All applications are
written in COBOL.

Requests from the different commands for new automated data
processing systems are coordinated by the Air Staff, which authorizes the
Design Center to work up a data processing plan for each approved
proposal. A preliminary analysis is completed by the directorate having
primary responsibility for the functional area, and the plan is
prepared. The plan includes estimates of resources required for its
development, implementation, and operation. Upon approval of the plan the
directorate is responsible for developing, implementing, and maintaining
the system.

When the developing organization has completed the development of
a system or completed a major modification, the system is turned over
to the Directorate of Systems Control for testing. The Quality Control
and Field Assistance Unit completes up to two levels of testing and
maintains liaison between the developing organization and the field when
operational problems occur.


The first test phase, Environmental Systems Test-I (EST-I), is
completed at the Design Center and is intended to be an extensive checkout
of the system in a simulated operational environment. For small systems
EST-I is the end of testing before release. More complex new systems,
or those subject to the National Privacy Act, are assigned to EST-II.
This includes a complete testing and implementation plan and is conducted
at from two to ten other installations to test the system in an operating
environment.

There are two aspects to the tests: operational and functional.
Operational testing means that, given certain data, the system will
produce the specified reports and other actions. Functional testing is
designed to verify that the information produced by the system is that
desired by those who will use the system. Testing is considered complete
when the Branch Chief receives from the test sites letters of
certification on the operation and functioning of the system.

5.1.2  The  Planning  and  Resource  Management  Information  System  (PARMIS) 

Software development projects at the Design Center are supported
by an automated system called the Planning and Resource Management
Information System (PARMIS). The system is a project planning and
management aid that enables project managers to enter estimated schedules
and manpower allocations and to receive regular reports on actual
utilizations.

PARMIS is activity-oriented. A project manager establishes a new
project on PARMIS by estimating start and end times and man-hours for
each activity. The breakdown of the project into specific activities
is to a great extent up to the discretion of the manager. He may make
the activities broad or narrow, according to the size and complexity of
the project and the number of persons involved. The selection of
activity titles is facilitated by the PARMIS Catalog, which is a
pre-programmed breakdown of activities to different levels of detail. One
restriction imposed recently is that the activities must be consistent
with the Design Center's standard life-cycle model. The PARMIS
Catalog's designation of activity codes and activity group codes makes it
possible to separate the development activities into activity groups.
The activity group codes, presented in Table 5.1, were instrumental in
assigning the resource utilization data to the process model used for
this study.


PARMIS accepts planned and actual man-hours under four job
classifications: Functional Analyst, Data Systems Analyst, Programmer, and
Support Personnel. Actual expenditures of time are reported by activity
and job classification. Both expected and actual start and completion
dates are reported by the system for each activity. The project manager
is free to revise his dates and man-hour estimates, but the system always
maintains and reports the estimates made when the project was entered into
the system.*

Actual expenditures of time are reported weekly. Each project
manager is responsible for submitting reports of hours expended by person
and activity. Regular reports are prepared at several levels beginning
with the project level and extending to the Center level. When projects
are completed, they are removed from the active data base and stored on
history files.


*The relationship between estimated and actual man-hours has been
previously investigated by Lt. Col. Gehring^S. In particular, he
correlated accuracy in predicting specific activities with accuracy
in predicting total effort.

TABLE 5.1

PARMIS ACTIVITY GROUP CODES

X010  Conception
X011  Initial Processing
X020  ADP System Manager Evaluation
X030  Evaluation of Data Processing Requirements
X040  Preliminary Requirements Definition
X050  Preliminary Requirements Review

Definition Phase
X060  Definition of Requirements
X061  Preparation of Data Processing Plan
X062  Functional Description
X063  Alternative Concepts
X064  Economic Analysis
X065  Preparation of SRR Document
X066  Coordination of Functional Description
X070  System Requirements Review (SRR)

Design
X081  Updated Data Processing Plan
X082  System/Subsystem Specifications
X083  Data Base Specifications
X084  Data Requirements Document
X085  Preparation of SDR Document
X086  Hardware Specifications
X090  System Design Review (SDR)

X100  Program Development
X101  Programming
X102  Program Specifications
X103  System Testing
X104  Program Development Document
X105  Test Plan
X106  Computer Operation Manual
X107  Program Maintenance Manual
X108  Preliminary Implementation Requirements
X109  User's Manual
X110  Technical Test Review
X120  System Status Review

Test Phase
X130  Environmental System Test I
X140  Environmental System Test II
X150  System Validation Review
X160  Worldwide Release

Operations Phase
      System Implementation
X175  Final Operational Evaluation

Other
X350  Continuous Projects
X351  Minor Changes
X352  Individual Minor Projects
X360  Future Projects
X400  AF-Provided Software
X410  Vendor-Provided Software
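The way activity group codes can be used to assign reported hours to
phases of the process model can be sketched as follows. This is an
illustrative assumption about how the mapping might be coded: the numeric
ranges are simply read off Table 5.1, the "X" is treated as a literal
prefix, and the function and phase names are hypothetical.

```python
# Phase boundaries read off Table 5.1. Treating the codes as numeric
# ranges, and the phase labels themselves, are assumptions of this sketch.
PHASE_RANGES = [
    (10, 50, "Conception"),
    (60, 70, "Definition"),
    (81, 90, "Design"),
    (100, 120, "Program Development"),
    (130, 160, "Test"),
    (175, 175, "Operations"),  # only the one Operations code legible in the table
    (350, 410, "Other"),
]

def phase_for_code(code):
    """Map an activity group code such as 'X103' to a phase name."""
    number = int(code[1:])  # drop the leading 'X'
    for low, high, phase in PHASE_RANGES:
        if low <= number <= high:
            return phase
    return "Unknown"

for code in ("X011", "X062", "X085", "X103", "X140", "X175", "X351"):
    print(code, phase_for_code(code))
```

A lookup table keyed on ranges keeps the phase boundaries in one place,
which is convenient when a project manager's activity definitions differ
slightly and the totals must be recomputed.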


5.1.3  Description  of  the  Data  Base 

PARMIS has been in use at the Design Center since 1970. Since
that time more than 2000 project summaries have been accumulated in the
history files.

The History File for each project is the activity-level file as
it existed when the last activity was reported completed. There is a
separate record for each activity and each activity group. Table 5.2
lists the elements of the History File. History Files are maintained
on a fiscal year basis. A project is entered into the History File for
the fiscal year in which its last activity was completed.

The History Files reside on magnetic tapes. The preparation
of special reports is greatly facilitated by a special report generator
system.

Projects to be included in this study were selected by first
examining summary reports on all projects completed during Fiscal Years
1975 and 1976. Special reports on candidate projects were then prepared
by using the History Files and the report generator.

5.1.4  Data  Collection  Procedure 

Collection of the project data was a three-step process. First,
detailed information was collected for candidate projects. This included
details about estimated and actual dates and man-hours for each activity
in each project. Second, a detailed questionnaire was prepared and
interviews were conducted with project managers to obtain information
not available in PARMIS: program size, management techniques,
documentation, and other items describing the product and its development
environment. Finally, the error reports maintained by the Field Assistance
Branch were studied to determine the numbers and types of errors reported
by users and when they occurred.


TABLE 5.2

SUMMARY OF HISTORY-FILE DATA ITEMS
(EACH ACTIVITY)

 1.  Project originator number
 2.  Activity group number
 3.  Control number
 4.  Activity number
 5.  Activity description
 6.  ADP system number
 7.  ADS number
 8.  Management category
 9.  Milestone number
10.  Type of computer
11.  Work category
12.  Data systems designator number
13.  System code
14.  Type of system
15.  Program action code
16.  Program number
17.  Schedule indicator
18.  Responsible individual for data
19.  Privacy key
20.  All predecessor activities (control and activity numbers)
21.  All successor activities (control and activity numbers)
22.  Activity description
23.  Start date
24.  New start date
25.  Estimated completion date
26.  New estimated completion date
27.  Actual completion date
28.  Plan change date
29.  Span days
30.  New span days
31.  Remaining span days
32.  Estimated man-hours by skill and total
33.  New estimated man-hours by skill and total
34.  Expended man-hours by skill and total
35.  Monthly expended man-hours by skill and total
36.  Current expended man-hours (since plan change date) by skill
     and total


PARMIS Data Collection. Summary reports from the history files
for Fiscal Years 1975 and 1976 (year-to-date) were examined. These
reports contain start and completion dates and estimated and actual
man-hours by job classification and by activity group for each project.
Personnel in the Project Management Division, Operations Branch provided
consultation on use of the history files and prepared the computer runs
for obtaining the project summaries. The following criteria were used
to select projects for the study.

1.  Project completed between January 1974 and April 1976.
    This was to ensure that some error history would be
    available, but that the project was not completed so long
    ago that getting management information would be difficult.

2.  Activity descriptions including entire software development
    life cycle.

3.  Projects greater than 2,000 actual man-hours and six months
    duration.
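The three selection criteria above can be expressed as a simple filter.
The sketch below is illustrative only: the record layout, field names,
and sample projects are hypothetical, and treating the date limits as
inclusive of the whole of January 1974 and April 1976 is an assumption.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ProjectSummary:
    """Hypothetical summary record; field names are illustrative."""
    pon: str                      # Project Originator Number
    completed: date               # actual completion date
    actual_man_hours: float
    duration_months: float
    covers_full_life_cycle: bool  # activities span the whole development cycle

def is_candidate(project):
    """Apply the three selection criteria listed above."""
    in_window = date(1974, 1, 1) <= project.completed <= date(1976, 4, 30)
    large_enough = (project.actual_man_hours > 2000
                    and project.duration_months > 6)
    return in_window and project.covers_full_life_cycle and large_enough

# Made-up sample projects for demonstration.
projects = [
    ProjectSummary("PON0000001", date(1975, 6, 30), 5200, 14, True),
    ProjectSummary("PON0000002", date(1973, 11, 15), 8000, 20, True),
    ProjectSummary("PON0000003", date(1975, 9, 1), 1500, 8, True),
]
selected = [p.pon for p in projects if is_candidate(p)]
print(selected)
```

Only the first sample project passes; the second was completed too early
and the third falls below the man-hour threshold.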

Each project is identified on the history file by its ten-character
Project Originator Number (PON), which is established at the time the
project is authorized. Using this number as the primary search key, a
special file was created and sorted by PON, contributing organization,
activity groups, and control number. This sequencing had the effect of
separating each project by organization and development phase. Using
the activity group code and control number as breakpoints, totals of
estimated and actual hours were automatically prepared for each
development phase.* Figure 5.1 shows a page from the special history
report.

Project  Manager  Data.  The  personnel  of  the  Project  Analysis 

Branch  obtained  the  names  of  the  project  managers  for  the  20  projects 

— 

In  some  projects  the  project  managers  used  slightly  different  definitions 
of  the  development  phases.  In  these  cases  revised  totals  were  calcu- 
lated manually. 


Figure  5.1.  Sample  Printout  From  the  PARMIS  Special  History  Report 


whose  sunnnarles  were  obtained  from  the  PARMIS  data  base.  They  coordinated 
Initial  interviews  with  the  managers  (or  other  persons  familiar  with  the 
projects)  and  circulated  copies  of  the  questionnaires  (see  Appendix  A) . 


About  a week  after  the  questionnaires  were  distributed,  they  were 
collected  and  the  contents  were  discussed.  In  some  cases  problems  were 
discovered  and  entries  were  changed  or  new  data  were  indicated.  Several 
respondents  needed  additional  time  to  complete  the  forms. 

In  all,  17  completed  forms  were  obtained.  Of  the  three  other 
systems  for  which  information  was  requested,  one  had  been  replaced  by  a 
newer  system  and  all  related  records  destroyed;  the  other  two  were 
modifications  to  a large  logistics  system,  and  their  records  were  not 
separable  from  those  of  the  primary  system  and  therefore  were  not  collected 
for  this  study. 

Of  the  17  forms  that  were  returned,  10  were  missing  program-size 
information.  This  information  is  not  usually  recorded,  and  the  systems 
for  which  we  were  collecting  the  data  had  been  modified  to  the  extent 
that  the  present  program  sizes  were  not  indicative  of  the  originals. 


Therefore,  of  the  20  systems  for  which  information  was  requested, 
usable  data  were  obtained  for  only  seven. 

Error  Data  Collection.  Reports  of  system  difficulties  are  pro- 
cessed by  the  Field  Assistance  Branch  of  the  Systems  Control  Directorate. 
Each  trouble  call  or  difficulty  report  on  an  operational  system  is  logged. 
Referrals  are  made  to  the  directorate  responsible  for  system  maintenance, 
and  follow-up  contacts  are  maintained  until  the  report  is  determined  to 
be  false  or  a duplicate  of  a previously  reported  error,  or  until  a 
correction  is  released.  A very  complete  description  of  all  reported  prob- 
lems and  their  disposition  is  maintained.  Summary  reports  are  released 
regularly. 
Unfortunately, records on individual systems are maintained for
only the preceding 12 months. As a result, error reports were not
available for one of the systems. In addition, it was learned that
errors in another system are not reported through the Field Assistance
Branch because the system is used only at the Design Center. This left
error summaries for five of the seven systems.


5.1.5 


The data obtained from PARMIS describing activity dates and hours
should be of as high quality as can be obtained for software development
analysis. They were constructed according to well-established definitions
and procedures of long standing. They were recorded weekly as events
happened, and they reflect activity definitions that are directly
applicable to our analyses.

Errors in the PARMIS data could come from poor management reporting,
misuse of activity definitions in establishing and reporting projects,
misleading representations of project status submitted to hide slippages,
and dumping of idle time into active projects. However, these errors
would exist in any project reporting system and should be minimized by
the procedures in effect at the Design Center.

The error reporting data are part of a very extensive system of
quality control in effect at the Design Center. The complete logging
and follow-up of each difficulty, made by a unit that is separate from
the developing unit, ensures the quality of these data.

The project manager (questionnaire) data are the weakest. The
questions asked for very detailed information that is not part of the
records kept by the managers. Furthermore, military personnel are
transferred frequently, and several of the project managers had been
transferred. The Design Center uses a project-oriented organizational
structure within the directorates, and therefore it was sometimes
difficult to obtain information because persons who worked on the system
were scattered.




However,  we  did  locate  knowledgeable  individuals  for  each  project, 
and  through  their  cooperation  and  patience  it  was  possible  to  locate  the 
records  required.  Old  program  listings  were  searched  for  program  size 
information,  notes  were  dusted  off,  and  telephone  discussions  were  held 
with  other  persons  who  had  worked  on  the  projects.  By  this  process  it 
was  possible  to  obtain  much  of  the  data  that  was  asked  for. 

5.1.6  Data  Conditioning 

The  results  of  the  data  collection  effort  are  presented  in  Table 
5.3.  The  first  19  items  for  each  project  were  derived  from  the  PARMIS 
history  files;  items  20  through  27  were  obtained  from  the  project-manager 
questionnaires  and  item  28  from  the  error  reports;  and  items  29  through  34 
were  derived  from  the  other  items. 

Estimated and actual values were determined from the activity-level
printouts from the history files. Activity descriptions were checked to
ensure that the activities were included in the phase that had been
established by this study's process model.

In  some  projects,  the  managers  had  entered  activities  designated 
as  PDR  or  CDR.  The  dates  were  recorded  for  these  events.  Otherwise, 
the  dates  for  completion  of  the  analysis  activities  and  the  design  activi- 
ties, respectively,  were  recorded  as  the  PDR  and  CDR  dates. 

To establish the hours of analysis and design completed before the
start of coding, all the coding activities were scanned to determine the
earliest actual starting date. The estimated and actual hours for
analysis and design activities before this date were summed to arrive at
that entry. If an analysis or design activity spanned the start-coding
date, it was proportioned between the "before" and "after" totals,
assuming a constant level of effort.
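
The proportioning rule for an activity that spans the start-coding date
can be sketched as follows. This is only an illustration of a
constant-level-of-effort split; the function name, the use of calendar
days as the time unit, and the sample activity are assumptions and do
not come from the PARMIS files.

```python
from datetime import date


def split_hours(start, end, hours, start_coding):
    """Split an activity's hours into (before, after) the start-coding date.

    Assumes a constant level of effort between `start` and `end`.
    """
    total_days = (end - start).days
    if total_days <= 0:
        raise ValueError("activity must end after it starts")
    # Days of the activity that fall before coding began, clamped to the
    # activity's own duration so wholly-before or wholly-after cases work.
    days_before = min(max((start_coding - start).days, 0), total_days)
    before = hours * days_before / total_days
    return before, hours - before


# Hypothetical activity: 300 hours over 30 days, coding starts 10 days in.
before, after = split_hours(
    date(1975, 1, 1), date(1975, 1, 31), 300.0, date(1975, 1, 11)
)
print(before, after)
```

In this made-up case one third of the activity precedes the start of
coding, so one third of its hours is credited to the "before" total.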
TABLE 5.3 (Contd.)


Several  of  the  systems  contained  parts  that  were  taken  from  pre- 
viously developed  systems.  Also,  some  of  the  analysis  and  design  had 
been  done  under  previous  projects.  Adjustments  to  the  raw  data  were 
made  to  relate  all  the  analysis  and  design  and  coding  and  testing  hours 
to  the  new  product  developed  and  not  necessarily  the  end  or  delivered 
product. 

Corrections for Existing Analysis and Design. Systems 1, 3, 4,
and 5 used existing analyses and designs to develop parts of the
programs. The proportions are given for each case (lines 26 and 27). Of
the total source-code lines delivered (line 24), only a portion were
derived from the man-months of effort shown in lines 10 and 11. The
percentages given in lines 26 and 27 were used to make the "effective"
source-code lines shown in lines 29 through 31 consistent with the level
of effort. These data were used to relate the analysis and design hours
to program size (see Fig. 5.2). The procedure used in making the
corrections is described next.

Corrections for Program Size. In the following analyses, resources
expended during the different development activities will be compared. We
will attempt to compare man-months of time required to complete each
activity, design changes, and errors, with product descriptors in order to
discover if quantitative relationships can be established. To make such
comparisons, it is necessary to have some measure of the product; we have
selected program size as the single measure. Since we want to compare
the different resource requirements for a given end product, it is
necessary that all the resources for a given program development be
consistent with that measure. Unfortunately, some of the programs in
Table 5.3 incorporated existing code or designs. For those that included
existing code, only the new code (line 25) was counted, since this is the
product produced by the development resources expended.


In the case of existing design or analysis, it was reasoned that
the recorded man-months were not as much as would have been needed to
produce the new lines of code from scratch. Therefore, we adjusted the
hours of design or analysis so that they represented what would have
been required to produce the new lines of code. Only by making these
corrections was it possible to make comparisons among the different
activities for the same program development. These adjusted values are
used only for examining the analysis and design trade-offs with
programming and testing presented in Sec. 5.3; uncorrected values are
used to establish the baselines for each development activity.

Two methods were considered for adjusting the analysis and design
man-months for program size. One possibility is to use the percentage
figures on each project to calculate the effort that would have been
required if analysis and design had been done from scratch. For
example, suppose a program has 2,000 new lines of source code, two
man-months were spent in design, and 50 percent of the program was based
on an existing design. Then the reported design man-months were
associated with the production of 1,000 lines of code, and we would
calculate a rate of two man-months per 1,000 lines of code. If we
accept this method of correction, we would describe the "effective"
design time for the 2,000 lines of new code as four man-months.

The problem with this method is that it greatly increases the
analysis and design hours when the new lines are only a small part of
the programs. But it is reasonable to expect that the relationship
is not linear for small analysis and design efforts; that is, some
amount of analysis and design is necessary, however little new code
is to be written.
Because of this, we chose to use the slope of the regression line
in Fig. 5.2 (one man-month per 1,000 lines) as an indication of the
rate of increase of analysis and design time with program size. This
measure has the advantage of being representative of a group of
measurements, rather than an individual case, and is a more likely
predictor of the rate of change. The rate of change is applied to the
reported months in the following way. Using the previous example, the
reported two man-months of analysis and design required to produce
1,000 lines of code would be increased by one man-month to obtain an
"effective" expenditure for 2,000 lines of new code. This would result
in an effective analysis and design effort of three man-months, compared
to the four man-months obtained by using the first method.
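
The two adjustment methods can be put side by side in a short sketch
using the example figures from the text (2,000 new lines, two reported
man-months, 50 percent based on an existing design). The function and
parameter names are illustrative assumptions, not terms used in the
study.

```python
def effective_mm_proportional(reported_mm, new_lines, fraction_existing):
    """Method 1: scale the reported effort up to all of the new code."""
    lines_covered = new_lines * (1.0 - fraction_existing)
    return reported_mm * new_lines / lines_covered


def effective_mm_slope(reported_mm, new_lines, fraction_existing,
                       slope_per_1000=1.0):
    """Method 2: add effort for the uncovered lines at the regression slope.

    slope_per_1000 is the Fig. 5.2 slope in man-months per 1,000 lines.
    """
    lines_covered = new_lines * (1.0 - fraction_existing)
    return reported_mm + slope_per_1000 * (new_lines - lines_covered) / 1000.0


method_1 = effective_mm_proportional(2.0, 2000, 0.5)   # four man-months
method_2 = effective_mm_slope(2.0, 2000, 0.5)          # three man-months
print(method_1, method_2)
```

As the fraction of existing design approaches 100 percent, the first
function grows without bound while the second adds at most one
man-month per 1,000 new lines, which is the behavior that led to the
choice of the second method.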

Figure 5.2 shows the analysis and design time for a number of
programs, plotted against lines of source code. The line results from
a regression analysis with the line forced through the origin. Data
point 5 was not included in the regression. The slope of the line
(1.0 MM/1000 lines) is the same as that obtained in the NASA study.
This gives some confidence that the slope represents the linear
relationship between lines of source code and man-months of analysis
and design.
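
For reference, a least-squares line forced through the origin has slope
sum(x*y) / sum(x*x). The sketch below shows that calculation; the data
values are invented for illustration and are not the Fig. 5.2 points.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope of y = m*x with the intercept fixed at zero."""
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("need at least one non-zero x value")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


# Hypothetical data: program size in thousands of lines, effort in man-months.
size_kloc = [2.0, 5.0, 10.0]
ad_man_months = [2.5, 4.0, 10.5]
slope = slope_through_origin(size_kloc, ad_man_months)
print(round(slope, 3), "man-months per 1,000 lines")
```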

Lines  32  and  33  of  Table  5.3  were  obtained  by  adjusting  the 
reported  man-months  for  analysis  and  design  (lines  17  and  18).  The 
adjustment  of  data  point  3 will  be  described  to  illustrate  how  the 
entry  for  line  32  was  obtained.  (The  corresponding  entry  for  line  33 
is  obtained  in  the  same  manner  except  that  the  entry  on  line  18  is 
used  instead  of  line  17.) 
The delivered product has 5,100 new lines of source code (line 25).
According to the questionnaire, 75 percent of the code was written
using an existing Part I specification (line 26) and 25 percent was
written from an existing Part II specification (line 27). Therefore,
the reported man-months for analysis and design represent the effort
required for end products of 1,300 and 3,800 lines of source code
(lines 29 and 30). The mean value, 2,550 lines of code, is used to
represent the combined analysis and design effort (line 31).

As was described above, the "effective" analysis and design
man-months consistent with the end product value of 5,100 lines of code
were then obtained by calculating the additional effort that would have
been required to produce the program if none of the analysis and design
had existed at the start. An additional analysis and design effort equal
to that required to produce 2,550 lines of source code (5,100 - 2,550)
would have been required. At a rate of one man-month per 1,000 lines,
this would require 2.55 additional man-months. Therefore, an effective
value of 16.28 man-months (13.73 + 2.55) is entered in line 32. This
value is used in subsequent analyses when analysis and design time is
compared with coding and testing time.
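
The adjustment of data point 3 can be retraced with a few lines of
arithmetic. The function below is a sketch of the procedure as described
in the text; its name and parameters are assumptions, and it works with
the unrounded line counts (1,275 and 3,825) rather than the rounded
1,300 and 3,800 quoted above, which gives the same mean of 2,550.

```python
def effective_ad_man_months(reported_mm, new_lines,
                            pct_existing_part_1, pct_existing_part_2,
                            slope_per_1000=1.0):
    """Adjust reported analysis-and-design man-months to cover all new code."""
    # Lines actually supported by the reported analysis and design effort.
    lines_analysis = new_lines * (1.0 - pct_existing_part_1 / 100.0)
    lines_design = new_lines * (1.0 - pct_existing_part_2 / 100.0)
    effective_lines = (lines_analysis + lines_design) / 2.0
    # Effort for the remaining lines at the regression slope (MM per 1,000).
    additional_mm = slope_per_1000 * (new_lines - effective_lines) / 1000.0
    return reported_mm + additional_mm


# Data point 3: 5,100 new lines, 75 and 25 percent existing specifications,
# 13.73 reported man-months.
point_3 = effective_ad_man_months(13.73, 5100, 75, 25)
print(round(point_3, 2))
```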


Corrections for Programming Language. All the systems except
number 1 are written in COBOL; System 1 is written principally in
assembly language. Our results from studying the ADPREP data (Fig. 6.2)
were used to correct the man-months presented for System 1 and to make
them consistent with the other data. Accordingly, man-months for
System 1 are multiplied by 0.54 (see equations following Fig. 6.2) when
they are compared with the other systems.

5.2  EVALUATING RELATIONS AMONG ACTIVITIES

The presentation of the Air Force Data Systems Design Center data
is designed to support the study of the relations among activities
discussed in Sec. 4. In accordance with the previous discussion, we
are attempting to show the following relationships:

1.  Extending the analysis and design activities beyond the
start of the coding activities increases the probability
of more hours of programming and testing, or increased
operational errors.

2.  Increasing analysis and design time tends to decrease the
programming and testing time or the number of operational
errors.

We are attempting in these comparisons to demonstrate how one
activity of the software life cycle is affected by and in turn affects
other activities. Obviously, most of these relationships cannot be
analyzed independently. That is, we cannot compare analysis and design
time with programming and testing time without at the same time
considering the number of operational errors. As was indicated by
Fig. 4.6, there are many components to each comparison. With the limited
data that are available it will not be possible to test the various
relationships rigorously. In this section we attempt to show that the
basic relationships indicated by the hypotheses are supported by the
available data. In the next section we will then show what quantitative
results can be obtained and by so doing indicate the direction for
future work.

Figures 5.3 to 5.8 are plots of the man-months expended for the
various activities of each of the seven projects against program size. The
clearest conclusion that can be drawn from these plots is that the wide
scatter (particularly in the Analysis and Design plots) makes it futile
to attempt to derive estimating relationships for the separate activities.
The fit lines shown in Figs. 5.3, 5.4, and 5.7 will be discussed later.

This scatter is really quite surprising. We remind the reader
that these seven data points represent projects all completed under the
same development philosophy at the same agency, programmed in the same
language (except one, which has been adjusted for the language difference)
and designed for similar applications. Adjustments have been made for
projects that built on previous analysis and design efforts, with the
data representing only the new coding effort. If independent estimates
of the resources required for each life cycle activity could be supported
by any data base, it should be this one. We therefore conclude that the
development of independent resource-estimating relationships for each
activity does not offer any promise as a means of estimating resource
requirements for software development.

Figure 5.5. Coding

This  brings  us  to  the  second  conclusion  that  can  be  drawn  from  the 
figures.  On  examining  all  the  figures  taken  together,  it  is  clear  that 
the  relations  between  activities  are  important,  and  that  trade-offs 
between  the  activities  do  occur. 

Perhaps  this  is  easiest  to  see  by  ignoring  the  program  sizes  for 
the  moment,  and  simply  ranking  the  seven  programs  in  order  of  man-months 
for  each  activity.  Such  an  ordering  is  shown  in  Table  5.4.  Note  how 
the  order  changes  from  activity  to  activity. 


TABLE 5.4

MAN-HOUR RANK ORDER OF DATA POINTS IN EACH ACTIVITY
(In Order of Decreasing Man-Hours)

Analysis & Design
--------------------------                      Total Coding    Error
Total      Prior to Coding    Coding    Test    and Test        Rate

  6               6              6        4         6             1
  7               7              7        2         4             6
  3               2              2        1         7             7
  2               3              1        6         2             3
  5               4              4        7         1             2
  4               5              5        3         5             ?
  1               1              3        5         3             1
Data point 1 is lowest for analysis and design, in the middle for
programming and testing, and highest for error rate. Data point 2 is
about in the middle of the programs except for testing, where it is high,
and error rate, where it is low. Data point 3 is in the middle for
analysis and design, low for coding and test, and also low for error
rate. Data point 4 is low for analysis and design and for coding, and
highest for testing. Data point 5 is low for everything. Data points
6 and 7 are high for analysis and design and for coding, medium for
testing, and high for errors.
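
The rank comparison can be checked mechanically. The sketch below looks
up the position of a data point in each man-hour column; the column
contents are transcribed from Table 5.4 under the assumption that its
first five columns are total analysis and design, analysis and design
prior to coding, coding, test, and total coding and test. The error-rate
column is omitted because it covers only five systems.

```python
# Rank order of data points (decreasing man-hours), one list per activity.
RANK_ORDER = {
    "A&D total": [6, 7, 3, 2, 5, 4, 1],
    "A&D prior to coding": [6, 7, 2, 3, 4, 5, 1],
    "Coding": [6, 7, 2, 1, 4, 5, 3],
    "Test": [4, 2, 1, 6, 7, 3, 5],
    "Total coding and test": [6, 4, 7, 2, 1, 5, 3],
}


def positions(point):
    """Return the 1-based rank of a data point in every activity column."""
    return {name: order.index(point) + 1 for name, order in RANK_ORDER.items()}


for point in (4, 5):
    print(point, positions(point))
```

A project whose effort was driven by size alone would hold roughly the
same position in every column; point 4, which is near the bottom for
analysis and design but first for testing, is the clearest departure.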

From this simplified comparison, it is obvious that the man-hour
requirements are not consistent across activities (as would be the case
if size were the only driving parameter) and that there are apparently
trade-offs. Therefore, utilizing the figures, we tried to qualitatively
evaluate the hypotheses posed on page 5-20.

In order to test the hypotheses, however, it was necessary to
establish reference lines that could be used to distinguish increases and
decreases in the variables to be tested. These reference lines should
represent an ideal allocation of resources between the activities so
that departures from the ideal can be identified.

The nominal level of analysis and design hours as a function of
program size was determined by calculating a regression line (forced to
go through the origin) for the total analysis and design time (Fig. 5.4).
Differences between specific data points for time expended before and
after the start of coding were assumed to indicate departures from the
norm for a given program size.

In  Fig.  5.7,  showing  total  programming  and  testing  time,  the 
nominal  level  was  taken  to  be  the  regression  line  determined  for  the 
four  data  points  for  which  no  design  changes  (ECPs)  were  reported  during 
development  (numbers  2,  3,  6,  and  7). 




Comparing Figs. 5.3 and 5.4 indicates that four data points (1,
2, 4, 7) represent system developments in which the analysis and design
activities continued after the start of the coding activities. Figure
5.7 shows that data points 1 and 2 lie on the nominal programming and
testing line while 4 and 7 are below it. According to the first
hypothesis (p. 5-20), then, all four points should show relatively high
error rates. Figure 5.8 indicates that point 1 is high, error data for 4
are not available, and 2 and 7 are low, which does not support the
hypothesis.

Hypothesis 2 holds that increasing the investment in analysis and
design decreases programming and testing time or operational errors.
(Obviously this can only be true to a point. After the problem has been
properly defined and a detailed design completed, additional analysis
and design hours are a waste.)

Figure 5.4 shows two data points with higher investments in
analysis and design (3, 6) and two with less (4, 5). Of these, 3 has a
nominal programming and testing time, 6 is high, 4 is slightly low, and
5 very low. The hypothesis would indicate that the error rates for 3
and 6 should be nominal or less, and those for 4 and 5 should be higher
than the norm. The error data support the second hypothesis for points 3
and 6; error data for points 4 and 5 are missing.

Thus this analysis has not in general supported the hypotheses.
However, note how this form of investigation is dependent on the selection
of ideal resource allocation lines. Suppose, for example, that the
ideal expenditure of coding and testing man-hours was that shown by the
"alternative" line in Fig. 5.7. In that case, point 3 could be explained
as being low due to "overdesign"; points 2, 4, and 7 as being high due
to beginning coding before the end of analysis and design; point 1 (and
also 4) as being high due to design changes;* and point 4 as also being
high due to inadequate design effort. Only points 5 and 6 would not
support the hypotheses advanced (or be explainable as responding
to ECPs).† That is, only these points do not support the hypotheses
that coding and testing man-hours (1) increase if ECPs are introduced
during development, (2) increase if there is too little analysis and
design or if coding is initiated before the completion of design, or (3)
decrease if the project is "overdesigned."

* Points 2, 3, 6, and 7 had no design changes (see Table 5.3 and Fig. 5.7).

In  summary,  however,  we  must  allow  that  this  analysis  of  the 
hypotheses  using  the  small  data  set  has  not  produced  any  conclusive 
results.  A larger  data  set,  on  the  other  hand,  would  contribute  in  two 
ways  to  the  analysis.  First,  it  would  establish  ideal  resource  alloca- 
tion lines  more  accurately,  a key  to  the  trade-off  analysis.  Second, 
more  data  would  allow  stratification  into  different  populations,  so 
that  the  many  effects  that  are  operating  simultaneously  can  be  separated. 
Then,  within  these  strata,  the  trade-offs  should  become  clear. 

Even  so,  the  graphical  approach  is  of  limited  value,  leading  to 
explicit  hypothesis  formulation  as  opposed  to  the  establishment  of  an 
explicit  cost  estimating  relationship.  A more  quantitative  approach 
is  developed  in  the  next  section  which  demonstrates  the  potential  of 
analyzing  trade-offs  between  activities. 

5.3  QUANTITATIVE  TECHNIQUES  TO  ESTABLISH  TRADE-OFFS  BETWEEN  ACTIVITIES 

5.3.1  Establishing  the  Relationships 

To demonstrate this technique we will develop a relationship
between the resources devoted to analysis and design and the resources
required to complete the programming and testing activities. The working
hypothesis is that as a system is more completely described and more time
is available to work out details, fewer revisions to developed code should
be required. Furthermore, a complete design should allow the programmer to
proceed more quickly. Personal experience has indicated that when very
little analysis and design work is done, the programmer spends a
significant part of his time completing the design instead of actually
developing the system routines.

† Points 5 and 6 of Sec. 5.2 are rather anomalous at any rate. Point 5
is PARMIS itself, which apparently was done on a stringent budget by
experts. Point 6, on the other hand, appears to have had generous
manpower allocations throughout the life cycle.

Therefore we hypothesize that, as the amount of man-hours devoted
to analysis and design increases, the amount of time required for
programming and testing decreases. In order to satisfy the condition that
any valid relationship would be asymptotic to the ordinate and abscissa
and to keep the mathematics simple, only relationships of the form

    MM_{PT} = a (MM_{AD})^{-b}                                      (5.1)

are considered, where MM_{PT} is man-months of programming and testing
and MM_{AD} is man-months of analysis and design.

The trade-off between analysis and design and programming and
testing is only really valid for a given program. In order to develop
the relationship across a number of programs, descriptions of the other
program differences must be included in the equation form. That is,

    MM_{PT} = a (MM_{AD})^{-b} \times f(Program Descriptors)        (5.2)

Subsequent studies may indicate that the program descriptors must
be multi-dimensional and that f is a complicated expression. For now
we will use lines of source code as a descriptor and hypothesize the
relationship to be

    MM_{PT} = a (MM_{AD})^{-b} (LS)^{c}                             (5.3)

where LS = lines of source code.


With a large data base, a regression analysis could easily be run
to establish the relationship. However, we are working here with only
five data points (points 5 and 6 from Sec. 5.2 have been excluded for the
reasons cited in that section). A regression analysis on so few points
could not sort out the effects of the hypothesized trade-off from those
of program size, LS. Figure 5.9 is a sketch of the situation we are
faced with. With only a few data points, the dependence on size, shown
dashed, completely obscures the trade-off we are interested in examining,
shown by the solid curves.


Figure  5.9.  Trade-Off  Curves  and  Size  Parameter  for  Small  Data  Base 
For this illustrative exercise, therefore, we simply postulate
a plausible value for the exponent b, namely b = 0.5, and then
evaluate the remaining parameters a and c by regression.* The result
of the regression analysis is

    MM_{PT} = 15.3 (LS)^{0.[illegible]} (MM_{AD})^{-0.5}            (5.4)

This relationship is graphed, with the five data points, in Fig. 5.10.
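
One way to carry out a fit of this kind is to take logarithms, which
turns the model with b fixed into a straight line in ln(LS), and then to
judge the fit on the untransformed data. The sketch below does this; it
is an assumed reading of the procedure, not the study's own computation,
and the five data rows are synthetic values generated from made-up
parameters rather than the project data.

```python
import math


def fit_a_c(ls, mm_ad, mm_pt, b=0.5):
    """Fit a and c in MM_PT = a * LS**c * MM_AD**(-b) with b held fixed.

    Taking logs: ln(MM_PT) + b*ln(MM_AD) = ln(a) + c*ln(LS).
    """
    xs = [math.log(v) for v in ls]
    ys = [math.log(pt) + b * math.log(ad) for pt, ad in zip(mm_pt, mm_ad)]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    c = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    a = math.exp(mean_y - c * mean_x)
    return a, c


def r_squared(ls, mm_ad, mm_pt, a, c, b=0.5):
    """1 - unexplained/total variation, computed on the original data."""
    predicted = [a * l ** c * ad ** (-b) for l, ad in zip(ls, mm_ad)]
    mean_pt = sum(mm_pt) / len(mm_pt)
    unexplained = sum((y - p) ** 2 for y, p in zip(mm_pt, predicted))
    total = sum((y - mean_pt) ** 2 for y in mm_pt)
    return 1.0 - unexplained / total


# Synthetic data built from hypothetical parameters a = 10, c = 0.8.
ls = [2.0, 5.0, 10.0, 20.0, 40.0]
mm_ad = [3.0, 6.0, 8.0, 15.0, 30.0]
mm_pt = [10.0 * l ** 0.8 * ad ** -0.5 for l, ad in zip(ls, mm_ad)]
a_fit, c_fit = fit_a_c(ls, mm_ad, mm_pt)
print(round(a_fit, 3), round(c_fit, 3), round(r_squared(ls, mm_ad, mm_pt, a_fit, c_fit), 3))
```

Rerunning fit_a_c with other fixed values of b is a simple way to repeat
the sensitivity check on c described in the footnote.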

5.3.2  Optimum  Allocation  of  Resources 

The existence of a trade-off in man-hours between two activities of
the development cycle implies the existence of optimum values for the
man-hour variables. In this section we will solve for those optimum
values.

The form of Eq. 5.4 is such that as analysis and design man-hours
increase for a given program size, coding and testing man-hours decrease.
The total man-hours (and hence cost) will at first decrease and then
increase, as shown in Fig. 5.11. We want to locate the minimum of the
total cost curve. The derivation follows.


* As a check we tried b = 0.2 and b = 0.8. Neither made much change
in the computed value of c. Also, if we define

    R^2 = 1 - (unexplained variation)^2 / (total variation)^2

then Eq. 5.4 has R^2 = .824. This definition of R^2 is calculated on
the original data set rather than the form of the equation used in the
regression model. As such it can be directly compared to other model
forms and avoids the problems of fit indexes (of which r^2 is one)
discussed in Ref. 24. Although a good R^2 is not sufficient to prove the
model, it does show that the data are consistent with the model.


Figure 5.10. Relationship Between Analysis and Design and Programming and
Testing, Using Five Data Points


Let  the  cost  of  analysis  and  design  be  given  by 

"ad  - 

t 

where  - cost  per  man-hour  of  analysis  and  design.  Similarly,  let 

the  cost  of  programming  and  testing  be  given  by 

CpT  “ Cp^  MMp^ 

Now  total  cost  is  given  by 


or,  substituting  the  previous  relations  and  the  trade-off  equation  (Eq.  5.4) 

"“ad  ‘as)V.^ 

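Now the optimum is found by differentiating total cost with respect to
man-months of analysis and setting the result equal to zero:

$$\frac{\partial C}{\partial MM_{AD}} = C_{AD} - a\, C_{PT}\, (LS)^{c}\, \tfrac{1}{2}\, (MM_{AD})^{-3/2} = 0$$

or

$$MM_{AD}\big|_{optimum} = 0.63 \left(\frac{C_{PT}}{C_{AD}}\right)^{2/3} a^{2/3}\, (LS)^{2c/3}$$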
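The closed-form optimum can be checked numerically. The sketch below
assumes the trade-off takes the form MM_PT = a (LS)^c / (MM_AD)^(1/2),
which is the form the derivative above implies. The values of a and c
used here are placeholders chosen for illustration, not the fitted
Eq. 5.4 values, and the function names are hypothetical.

```python
def total_cost(mm_ad, ls, a, c, cost_ad, cost_pt):
    """Total cost when programming/testing effort is a * LS^c / sqrt(MM_AD)."""
    mm_pt = a * ls ** c / mm_ad ** 0.5
    return cost_ad * mm_ad + cost_pt * mm_pt


def optimum_mm_ad(ls, a, c, cost_ad, cost_pt):
    """Closed form: (1/2)^(2/3) (C_PT/C_AD)^(2/3) a^(2/3) LS^(2c/3)."""
    return (0.5 * (cost_pt / cost_ad) * a * ls ** c) ** (2.0 / 3.0)


# Placeholder inputs: a and c are illustrative, not the report's fitted values.
a, c, ls = 15.0, 0.75, 20.0
cost_ad, cost_pt = 1.0, 1.0

closed_form = optimum_mm_ad(ls, a, c, cost_ad, cost_pt)

# Brute-force check: scan a fine grid of MM_AD values and keep the cheapest.
grid = [0.01 * k for k in range(1, 20001)]
numeric = min(grid, key=lambda mm: total_cost(mm, ls, a, c, cost_ad, cost_pt))

# Leading constant with equal unit costs, and with unit costs of $7,020
# (programming and testing) and $6,560 (analysis and design); that
# assignment of the two figures is an assumption of this sketch.
equal_cost_constant = 0.5 ** (2.0 / 3.0)
unit_cost_constant = equal_cost_constant * (7020.0 / 6560.0) ** (2.0 / 3.0)

print(round(equal_cost_constant, 2), round(unit_cost_constant, 2))
print(round(closed_form, 2), round(numeric, 2))
```

The grid search and the closed form should agree to within the grid
spacing, since the total cost curve has a single minimum in MM_AD.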


Thus the optimum number of analysis and design man-months has been
found as a function of size (in thousands of lines of source code) and
the ratio of programming and testing unit cost to analysis and design unit
cost.

Values of $7,020 and $6,560 per man-month were computed for C_PT and
C_AD* from cost data reported by Wolverton (Ref. 19). These values give

$$MM_{AD}\big|_{optimum} = 0.66\, a^{2/3}\, (LS)^{2c/3}$$

or, with the values of a and c previously determined (Eq. 5.4),

$$MM_{AD}\big|_{optimum} = 4.07\, (LS)^{0.5}$$


The corresponding optimum time for programming and testing can
now be calculated from Eq. 5.4:

$$MM_{PT}\big|_{optimum} = \frac{a\, (LS)^{c}}{\left(MM_{AD}\big|_{optimum}\right)^{1/2}} = 1.23\, a^{2/3}\, (LS)^{2c/3}$$

or, substituting the values of a and c:

$$MM_{PT}\big|_{optimum} = 7.58\, (LS)^{0.5}$$
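As a worked illustration, the two optimum expressions can be evaluated
for a few program sizes. The sketch takes the exponent as 0.5 (the
square-root dependence on LS that the text describes) and LS in thousands
of lines of source code; the function name and the sizes chosen are
illustrative.

```python
import math


def optimum_effort(ls_thousands):
    """Optimum man-months from the two square-root expressions."""
    root = math.sqrt(ls_thousands)
    # (analysis and design, programming and testing)
    return 4.07 * root, 7.58 * root


for ls in (5, 10, 20, 36):
    mm_ad, mm_pt = optimum_effort(ls)
    print(f"{ls:>3}K lines: AD {mm_ad:6.1f}  PT {mm_pt:6.1f}  "
          f"total {mm_ad + mm_pt:6.1f}")
```

Doubling the program size multiplies both optimum efforts by about 1.41
rather than 2, which is the economy of scale the square-root form implies.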


The expressions give the optimum value for the investment considering
only one degree of freedom: the decrease in programming and testing
cost for increasing analysis and design cost. With a more complete
description of the relations among activities, the optimization would
have to consider the effects of increasing the investment in analysis
and design not only on programming and testing costs, but on such
life-cycle parameters as total cost, cost of errors, and effects on the
development schedule. The inclusion of these factors in future studies
offers the promise of giving program managers quantitative measures
for making decisions during software development.

* These values include both personnel and computer costs.

It is useful to compare the predicted man-months for the optimum
with actual relationships derived in a prior study (Ref. 22). Figures
5.12 and 5.13 compare the predictions with data on some large software
developments compiled for NASA.

The expression for the optimum relationships derived in this section
gives markedly different results from the NASA data, particularly for
programs larger than about 25,000 lines of source code. The dependence
on LS (a square root relationship) implies economies of scale, which
have not been evident in the historical data. This is probably a
direct consequence of the sample size and the range of the data within
the sample (maximum equals 36,000 new lines of source code). However, if
the square root rule holds with a larger data base, then the message is
quite different. Historically, we have not been achieving the economies
implied, due perhaps to poor allocation of resources.

5.3.3  Comparison  of  the  Optimum  Hours  with  the  40-20-40  Rule 

There are many references in the literature to the distribution of
resources among design (and analysis), coding, and testing. Although
there is considerable variation among the published distributions
(Ref. 22), many writers cite an average distribution of resources of 40%
analysis and design, 20% coding, and 40% testing. It is interesting to
develop a comparable distribution using the optimum values developed in
the preceding sections.

The ratio of optimum coding and testing time to optimum design
and analysis time can be calculated. Surprisingly, the answer is
independent of size (lines of source code):

$$\frac{MM_{PT}\big|_{optimum}}{MM_{AD}\big|_{optimum}} = \frac{1}{b}\,\frac{C_{AD}}{C_{PT}}$$

With the value of b = 0.5 we chose, and the values of C_PT and C_AD
specified earlier, this equation gives the optimum percentage of analysis
and design as 35%.

Figure 5.12. Comparison of "Optimum" Analysis and Design Time with
Project Data (axes: analysis and design time, man-months; 1000 lines of
source code)
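The size independence can be verified with a short calculation. Assuming
the trade-off form MM_PT = a (LS)^c / (MM_AD)^b, setting the derivative of
total cost to zero gives (MM_AD)^(b+1) = b a (C_PT/C_AD) (LS)^c, and
dividing MM_PT by MM_AD then leaves C_AD / (b C_PT), with no LS term. In
the sketch below the assignment of $7,020 to programming and testing and
$6,560 to analysis and design is an assumption; it is the assignment that
reproduces the 35% figure. The function name is hypothetical.

```python
def optimum_split(b, cost_ad, cost_pt):
    """Return (MM_PT / MM_AD at the optimum, analysis-and-design share)."""
    ratio = cost_ad / (b * cost_pt)
    return ratio, 1.0 / (1.0 + ratio)


ratio, ad_share = optimum_split(b=0.5, cost_ad=6560.0, cost_pt=7020.0)
print(round(ratio, 2), round(100 * ad_share))
```

With equal unit costs and b = 0.5 the split would be exactly one third
analysis and design; the small difference in unit costs moves it to about
35 percent.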


This result suggests that the trade-off equation must be of a
richer form for a trade-off to occur between these two phases, given an
optimal allocation. Alternatively, one might suggest that the optimal
allocation really is constant, as the 40-20-40 rule suggests. Also, the
relationships derived from the NASA data predict that design man-hours
vary between 41 and 42 percent.

To examine the relationship between coding and testing, a regression
was fitted to the AFDSDC data, with the following result:

$$\frac{MM_{T}}{MM_{C}} = 1.14 - 0.016\,(LS)$$

This relationship is plotted in Fig. 5.14. Using 35% for analysis and
design, this equation gives the following percentage allocations:


Lines of        Analysis and
Source Code     Design (%)      Coding (%)     Testing (%)

 5,000          35              32             33
10,000          35              33             32
20,000          35              36             29
35,000          35              41             24
50,000          35              48             17
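The allocations in the table can be approximated from the regression.
The sketch below assumes the fitted quantity is the ratio of testing
effort to coding effort, with LS in thousands of lines of source code;
under that reading the computed percentages agree with the table to
within one point of rounding. The function name is hypothetical.

```python
def allocation(ls_thousands, ad_percent=35.0):
    """Split the non-design effort using testing/coding = 1.14 - 0.016 * LS."""
    test_to_code = 1.14 - 0.016 * ls_thousands
    if test_to_code <= 0:
        raise ValueError("regression is outside its usable range")
    remaining = 100.0 - ad_percent
    coding = remaining / (1.0 + test_to_code)
    return ad_percent, coding, remaining - coding


for ls in (5, 10, 20, 35, 50):
    ad, coding, testing = allocation(ls)
    print(f"{ls * 1000:>6,}  {ad:4.0f}  {coding:4.0f}  {testing:4.0f}")
```

The linear form reaches zero near 71,000 lines, so the relationship
cannot be extended much beyond the sizes tabulated.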

The division between coding and testing resources changes significantly
over the range. Not only do none of the results match the 40-20-40 rule
well, but there is a significant change in the distribution between coding
and testing hours as program size increases. The results indicate that
the emphasis on testing becomes less as program size increases.

Figure 5.14. Testing and Coding Relationship

Again analyzing the trendline data from our previous NASA study,
we see an inverse relationship between man-months of coding and testing.
In particular,

$$\frac{MM_{T}}{MM_{C}} = \frac{1.25}{0.373}\,(LS)^{-0.196} = 3.35\,(LS)^{-0.196}$$


The direction of this relationship is the same as before; but over the
range of interest, coding varies between 17 and 23 percent, and testing
between 42 and 35 percent. Thus, the percentage allocation is quite
different, and much more in line with the 40-20-40 rule.

The question of why there is such a difference in the distribution
of coding and testing hours is an important one to address at this point.

It is possible that the use of a separate organization at the
Design Center to verify and validate software has the effect of decreasing
the relative time for testing. Since each programmer knows that his
system will be subjected to a thorough and independent test, he is more
likely to spend extra time to make sure that it works well before he
releases it. Also, since a different group has control of the programs
after testing begins, there is no opportunity for a programmer to leave
some items undone with the idea that they will be completed along with
any corrections to the code made during the testing period.

This explanation is attractive because it supports a long-held
suspicion that the testing phase is overstated in the 40-20-40 rule.
More importantly, it is possible that the Design Center's distribution
of effort between programming and testing represents a truer description
of the relative efforts. Hours are treated separately, and there is no
pressure to declare the program to be in internal integration and
testing at one point in time. Thus testing of already programmed code
and programming of new code can go on concurrently. In effect, the
"moving milestone" of Fig. 2.3 has been captured in the PARMIS data base.

On the other hand, data sources using a fixed milestone approach
may well allow man-hours to be recorded against testing once some portion
of the code is in testing. If this is the case, then initial coding
hours are being reported against testing, a practice which certainly
distorts the data. The 40-20-40 rule may be a consequence of this
reporting distortion.

Of course, another possibility is that the PARMIS data base is
just too small to make these particular estimates. Only a larger data
base will help alleviate that particular concern.

5.4 CONCLUSIONS

Studying the relations among project phases has made it possible
to develop the form of relationships that would be useful for software
project management. Results are only illustrative, due to the small data
base, but the technique has been demonstrated and the potential is
promising. Using these types of relationships for the optimum division
of resources between analysis and design and programming and testing, it
should be possible to lower development costs and to have a better basis
for evaluating project proposals and for making decisions during project
implementation. Extension of the method to include other software
dimensions such as reliability should make the study of relationships
even more valuable.

The results derived from the Data Systems Design Center suggest
that there is reason to question the 40-20-40 rule of thumb for
distribution of resources. This finding may have significant impact on
the future planning of software development projects. Furthermore, the
findings support previous results that indicate that the relative
emphasis on testing decreases as project size increases. Realizing
this expectation could lead to sizable savings.

6 QUANTITATIVE RESULTS

6.1  INTRODUCTION 

In  this  section  we  report  on  the  additional  analyses  that  were  made, 
using  data  bases  other  than  PARMIS. 

We were able to draw on some data from software contractors and to
obtain some high-quality information from one Program Office. Our
principal sources of data were the ADP Resource Estimating Procedures
(ADPREP) study (Ref. 12), the SAMSO Program Office that is responsible
for the development of general-purpose support software for the Satellite
Control Facility, some data from another GRC software project, and some
published data dealing with software reliability. These sources of data
are described in detail in Sec. 6.2. Sections 6.3 to 6.5 discuss how the
data were used to test some of the hypotheses discussed in Sec. 3, and
the results of these tasks. The importance of some explanatory variables
is covered in Sec. 6.3; relationships between alternative measures of
software products in Sec. 6.4; and characteristics of the maintenance
activity in Sec. 6.5.

6.2 DATA SOURCES

6.2.1  ADPREP 

ADPREP  was  a study  performed  by  Planning  Research  Corporation  for 
the  Army  in  1975.  The  study  reported  on  38  data  processing  systems:  18 
developed  by  the  Air  Force  and  20  by  the  Army.  These  systems  included 
both  business-oriented  (ADP)  and  other  programs,  although  ADP  programs 
dominate. 


The ADPREP study included interviews and reviews of project data
to produce summaries of each project's history and cost. The data
reported addresses development, operation, and maintenance of the software
systems. Some of the types of information reported are: (1) maintenance
data, including staffing, computer usage, and types of improvements made;
(2) program size, including counts of both source and object instructions;
(3) the number of changes in requirements during development; (4) the
project schedule; (5) the staffing of the development phase, broken down
by types of personnel; (6) effort, also broken down by types of personnel;
and (7) computer usage, broken down by major project phase (development,
operation, and maintenance).

The  major  limitation  of  the  ADPREP  data  is  that  effort  and  computer 
usage  are  aggregated  over  the  whole  project  and  not  broken  down  by 
development  phase  (analysis,  design,  coding,  test),  and  therefore  cannot 
be  used  in  any  but  the  most  aggregated  of  analyses. 

6.2.2 AOES Data

The Advanced Orbital Ephemeris Subsystem (AOES) is a large
ground-based C³I system that is used to support satellite systems. The
AOES data describes the maintenance and operation of more than 400
programs written and maintained by two contractors. The types of programs
include compilers, operating systems, data reduction and presentation
utilities, and orbit planning and analysis utilities.

The AOES is maintained by two associate contractors and an integration
contractor. The associate contractors are responsible for the maintenance
and improvement of separate portions of the software; roughly, one portion
is the operating system and the other is the applications programs used
to maintain satellite ephemerides. The role of the integration contractor
is to review the products of the associate contractors and to perform the
system-level testing. A very important additional function of the
integration contractor is to maintain configuration control over the
versions or models of the software that are released for operational use
by the Air Force.

To achieve configuration control, the integration contractor
maintains a data base of the characteristics of the various software
packages of the system (program components) and records describing their
modifications. Unfortunately, that data cannot be correlated with the
costs of the component activities. However, it was possible to use the
data to investigate the properties of the operations and support phase of
the software life cycle.

The data we collected describes four revisions or "releases" of
the system, and the maintenance histories of those revisions once they
became operational. For each program of each release of the system, we
collected (1) the delivered size, (2) the number of problems reported
and resolved, (3) the size of the Part II Specifications, (4) the design
changes incorporated with that release, and (5) the system integration
tests of the design changes. In addition, we collected data on the
man-months of development and the man-months of maintenance effort, and
the schedules for developing, testing, and operating the system releases.

The size of the software was measured in terms of twelve
characteristics, of which we found lines of source code and lines of
object code to be most useful. The number of design changes was determined
by the number of Design Change Requests approved for implementation in the
release being considered. The number of problems corrected by the release
was determined by counting the number of Discrepancy Report Forms (DRFs)
closed by the start of integration testing. The number of problems
reported with the integration testing and use of the release was obtained
by counting the number of DRFs opened (even if subsequently closed)
after the start of integration testing.
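The two DRF counts described above can be sketched as follows. The
record layout (dictionaries with "opened" and "closed" dates), the sample
records, and the treatment of a report dated exactly on the start of
integration testing are all assumptions made for illustration; the
integration contractor's data base format is not described in this
report.

```python
from datetime import date


def count_drfs(reports, integration_start):
    """Count DRFs closed by the start of integration testing, and DRFs
    opened after it (whether or not they were later closed)."""
    corrected = sum(
        1 for r in reports
        if r["closed"] is not None and r["closed"] <= integration_start
    )
    reported = sum(1 for r in reports if r["opened"] > integration_start)
    return corrected, reported


# Hypothetical Discrepancy Report Form records for one release.
drfs = [
    {"opened": date(1975, 1, 10), "closed": date(1975, 2, 1)},
    {"opened": date(1975, 1, 20), "closed": date(1975, 4, 15)},
    {"opened": date(1975, 3, 5), "closed": date(1975, 3, 20)},
    {"opened": date(1975, 3, 9), "closed": None},
]
corrected, reported = count_drfs(drfs, integration_start=date(1975, 3, 1))
print(corrected, reported)
```

A report opened before integration testing and closed after it falls
into neither count, which matches the two definitions as stated.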

The applications and analyses of the data were limited by the
difficulty of relating the effort expended by contractors to the changes
in the software. Specifically, only the final delivered sizes of programs
were reported; how much of the programs was new in each release was
unknown. Maintenance activities of the contractors began with the start
of system-level tests rather than operational use of the system, and
were reported in aggregate form. More importantly, revision and
maintenance often took place concurrently and were not reported
separately.


6.2.3  GRC  Internal  Data 

This collection of data was published in an earlier GRC study for
NASA. It consists of information obtained from the open literature,
reports on and interviews with specific NASA projects, and data provided
by Boeing Aerospace Company. The data generally reports on the total
dollars or effort required for a software development project, whose
size is reported in object-language instructions. The dollar costs
were converted to effort by first inflating the dollar values to a
common year (1974) and then converting to man-years with a factor of
$50,000 per man-year.
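The conversion from dollars to effort amounts to one multiplication and
one division. In the sketch below the inflation multiplier is supplied by
the caller; the value 1.25 and the $1.2 million project are hypothetical
and are not taken from the data collection. Only the $50,000 per man-year
factor comes from the text.

```python
def dollars_to_man_years(cost_dollars, inflation_to_1974,
                         dollars_per_man_year=50_000.0):
    """Inflate a reported cost to 1974 dollars, then divide by the
    man-year rate."""
    if inflation_to_1974 <= 0 or dollars_per_man_year <= 0:
        raise ValueError("factors must be positive")
    return cost_dollars * inflation_to_1974 / dollars_per_man_year


# Hypothetical project: $1.2M reported in an earlier year, with an assumed
# multiplier of 1.25 to reach 1974 dollars.
man_years = dollars_to_man_years(1_200_000, inflation_to_1974=1.25)
print(man_years, man_years * 12)
```

Any cost element included in the reported dollars, such as computer time,
is converted to man-years along with the labor.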


The data collected also reported the allocation of total costs
to three development phases: (1) design, (2) coding, and (3) test.
These phases are not identical with those defined in Sec. 2. For example,
"design" includes what we have called "analysis".


6.2.4  Consistency  of  the  Data  Sources 

Before  using  this  data  to  test  hypotheses,  we  will  examine  all 
of  the  data  used  in  this  study  and  compare  the  ADP  applications  with  the 
aerospace  applications. 


Of the data we have used, some describes software written for
aerospace defense applications, and some describes business or "ADP"
systems. One of the first considerations in using this mixture of
data is whether it is reasonable to use data on ADP systems to study and
predict the costs of defense systems.* Costs in some phases of a software
development obviously depend upon the application. This dependence was
noted throughout Sec. 3, and was accommodated by stratifying the cost
relationships by software type.

* This is especially important with the reliance on PARMIS data in the
previous section.

The ADP and aerospace software systems considered in this report
have similar life cycles that break down into the phases discussed in
Sec. 2. We believe that the general forms of the relationships describing
the costs of individual phases are independent of the application, but
that the values of the coefficients used in the relationships depend on
application.

Figure 6.1 illustrates how the various data sources compare with
one another in terms of size and development effort. The defense
(aerospace) programs examined in the NASA study cited in Sec. 6.2.3 are
plotted as dots; the ADPREP data, on a mixture of "scientific" and
"commercial" programs, are plotted as squares. The PARMIS data are plotted
as triangles. The same general trends are evident for each of the three
groups of data: development effort increases roughly linearly with
computer program size, but the variance is extremely large.

Generally, the defense-system programs are more costly. However,
both "commercial" and defense-system programs exhibit similar trends,
with somewhat more dispersion in the "commercial" data. We believe that
the programming processes used to develop "commercial" and defense-system
programs are similar, and that the higher cost of defense-system programs
can be explained by their complexity; this is reflected in coefficients
of cost estimating relationships. We believe that the functional
relationships between the components of the development activity and the
variables that explain effort are similar for either type of system. Thus,
valuable insights can be obtained from the study of "commercial" or "ADP"
developments. They also offer the advantage of clearly distinguishing the
costs of software from other system costs.

Several comments need to be made about the quality and nature of
these data collections. These observations tend to lessen the apparent
differences between the unit costs of the defense system or "scientific"
data points and those of the "business" or "ADP" data points plotted in
Fig. 6.1. The PARMIS and ADPREP data points represent software, mostly
of the "ADP" type, that was generally developed by the Army or the Air
Force. Almost all of the defense-system data points represent software
developed by private companies. It is not clear that the services report
clerical and support man-hours against the development in the same manner
that a private contractor would. If they do not, the ADP (Government)
applications should be more expensive than they were reported to be.

Secondly, some of the defense-system data points were derived from
total dollar costs inflated to 1974 dollars, assuming a $50,000 man-year.
The composition of the dollars reported is not always known. Some reports
may contain computer costs or hardware integration and test costs which
would tend to exaggerate unit costs. Thus, there is some possibility that
the trend line for the defense-system data should be moved closer to the
ADP trend. These conjectures have not been confirmed, as not all the
details of the data reported were available.

The  data  described  in  the  preceding  paragraphs  were  used  to  obtain 
results  that  demonstrate  some  of  the  important  explanatory  variables  of 
software  cost  (Sec.  6.3),  relationships  between  alternative  measures  of 
the  software  product  (Sec.  6.4),  and  characteristics  of  the  maintenance 
activity  of  software  (Sec.  6.5). 

6.3  TESTING  EXPLANATORY  VARIABLES  FOR  SOFTWARE  COSTS 

The bulk of this data is not detailed enough to test hypotheses
that relate explanatory variables to the resources consumed by the
specific phases of a software development, or to their relationships,
which we believe to be important in explaining variations in the
life-cycle costs of systems.

The ADPREP, AOES, PARMIS, and GRC data were used to determine the
significance of five effects that were hypothesized (in Sec. 3) to be
important in explaining the development and maintenance costs of software:
(1) size of the product, (2) the type of problem being solved, (3) the
programming language used in the software development, (4) constraints
imposed by limited hardware resources, and (5) changes in requirements
during the development of the software.

6.3.1  Product  Size 

Throughout Sec. 3, the hypothesized relationships defining software
life-cycle costs employ an independent variable that is related to the
size of the delivered software. In most cases, the estimate of size is
taken as object-code instructions, so that any effect of programming
language is minimized. It is therefore important that the data available
to this study demonstrate a clear relationship between the size of the
software and the effort required to develop the software. Since man-hours
were, in general, not broken down to the phase level, this aspect of the
study considered total development effort as a function of delivered
software size measured in object instructions for most of the data points
to be used. These data include ADPREP, GRC, and PARMIS. The AOES data
was omitted, since the size of the delivered software could not be
quantified and related to effort.

As Fig. 6.1 indicates, there is an apparent relationship between
size of the software and the effort required to develop that software.
However, the data varies so much that a relationship based solely on such
a trend is not useful as a cost estimator. As Sec. 4 indicated, there
is evidence to support the theory that part of the variation is explained
by trade-offs between the phases required to produce the software. In
addition, it is believed that there are other variables that will explain
the differences between the samples. The first of these is the type of
problem. The data available is meager, and thus it is not possible to
define the types of software systems that are in the sample space.
However, a gross categorization of the sample space is possible: defense
system and "commercial."* Analysis of the data shows that the trend
relating size to effort suggests that defense system programs are produced
at the average rate of 10 man-months per 1000 object code instructions in
the final program (approximately $50 per instruction). The trend for the
"commercial" programs indicates that they are produced at an average rate
of 3 man-months per 1000 instructions (approximately $17 per instruction).
Roughly, defense system programs are three times as expensive to develop.

It is important that the dollars-per-instruction figures not be
taken literally. As noted earlier, there may be important differences
in what is reported as development effort that would invalidate these
figures. The data are presented only to show that a relationship between
size and effort exists, and that the values are consistent with known
rules of thumb.

6.3.2  Programming  Language 

Section 3 discussed the role of programming language extensively.
It was asserted that programming language should have little, if any,
effect on the cost of phases such as design and analysis, since their
products are independent of the language. However, programming language
should have a dramatic effect on the effort required to code and test
software. One property of high-level languages is a large "code expansion
factor": one statement in the high-level language expands into many
object-language instructions. This would suggest that a high-level
language is more labor-efficient during the coding activities. In
addition, programs written in high-level languages should be easier to
test and maintain, since they express the design implementation in terms
more compatible with the actual problem being solved. (For instance,
the concept of an array is used to represent a vector in a high-level
language as compared with the mere consecutive words of memory used in an
assembly language.)

* Defense system applications are C³ or avionics (real time), while
"commercial" applications are "ADP" or "business" programs that run
in either batch or interactive modes.

Again, sufficiently detailed data was not available to determine
the validity of the hypothesized relationships discussed in Sec. 3. Since
programming language permeates the entire discussion of Sec. 3, it seems
reasonable to test for the significance of language if only in a limited
form. The ADPREP data contains information about the programming language
employed. The data is not broken down to the development phase level.
It was decided that regression analysis could be used to test for the
significance of this variable, using total development costs. What results
is not a useful estimating relationship for total development effort,
since the R^2 values are small; the T statistic for the independent
variable (programming language) can provide some insight about the
importance of programming language in explaining variations.*

* In this section we will be using the typical statistical measures
associated with regression analysis to define the quality of the
relationships. r^2 measures the goodness of fit, with values close to 1
showing a good relationship. F and T statistics measure the
significance of the model in explaining the data; F compares the entire
model to a model based on the data average, while T statistics measure
the significance of each individual parameter. Significant values
for F and T depend on the size of the sample, but in general, if F is
greater than 6 and T greater than 2, you have a significant result.
For further information consult Lindgren, Statistical Theory,
MacMillan, 1962.

The ADPREP data was analyzed from several points of view. The first
analytic results (Fig. 6.2) show the extent to which variations in total
development effort are explained by the programming language employed.
The figure is the result of using the program BMDSP (Ref. 25). The
regression analysis was performed on the logarithms of the variables so
that a relationship of the form

$$MM_{develop} = a'\,(\text{obj. inst.})^{b}\,(c')^{\text{language}}$$

could be determined. The R^2 value is that obtained for an equation of
the form

$$\ln(MM_{develop}) = a + b\,\ln(\text{obj. inst.}) + c\,(\text{language})$$

where

language = 0 for assembly language
         = 1 for high-level language


The  graph  shows  the  natural  log  of  the  delivered  object  instructions  on 
the  horizontal  axis  and  the  natural  log  of  the  total  man-months  of 
development  effort  on  the  vertical  axis.  The  actual  ADPREP  observations 
are  plotted  as  the  symbol  "0".  The  predicted  values  are  shown  by  the 
lines. 


The regression analysis shows the following coefficients for the
above equation:

Coefficients      T Statistics      Other Statistics

a = -3.1588                         R^2 = 0.3858
b =  0.7850       b:  4.483         F   = 10.0500
c = -0.6090       c: -1.408
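A fit of this kind can be sketched with ordinary least squares on the
logged variables, with the language entered as a 0/1 indicator. The
sketch below uses only the standard library and solves the normal
equations directly. The observations are synthetic and noise-free,
generated from known coefficients so that the fit can be checked; they
are not ADPREP records. The sketch does not reproduce the output of the
regression program used in the study and does not compute T or F
statistics. All function names are hypothetical.

```python
import math


def solve_linear(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; regressors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * solution[k] for k in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_log_linear(obj_inst, high_level, man_months):
    """Least-squares fit of ln(MM) = a + b ln(obj. inst.) + c (language)."""
    rows = [[1.0, math.log(n), float(flag)]
            for n, flag in zip(obj_inst, high_level)]
    y = [math.log(mm) for mm in man_months]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear(xtx, xty)


# Synthetic, noise-free observations from known coefficients
# (illustrative values only; these are not ADPREP records).
true_a, true_b, true_c = -3.0, 0.8, -0.6
sizes = [2_000, 5_000, 12_000, 30_000, 80_000, 150_000]
flags = [0, 1, 0, 1, 1, 0]
effort = [math.exp(true_a + true_b * math.log(n) + true_c * f)
          for n, f in zip(sizes, flags)]

a, b, c = fit_log_linear(sizes, flags, effort)
print(round(a, 3), round(b, 3), round(c, 3))
```

With real data the fitted coefficients would carry sampling error, which
is what the T statistics in the table summarize.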


With the equation converted to the form

$$MM_{develop} = a'\,(\text{obj. inst.})^{b}\,(c')^{\text{language}}$$

the coefficients become

a' = 0.04
b  = 0.79
c' = 0.54

It should be noted that the T and R^2 statistics apply only to the natural
logarithms used in the regression and not to these modified coefficients.
Also, while b is significant, c is only marginally so (confidence level
80%), so the relationship is not strong.
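The conversion is a matter of exponentiating the intercept and the
language coefficient: a' = e^a and c' = e^c, with b unchanged. The sketch
below checks the rounded values against the reported regression
coefficients and evaluates the converted equation at one size. The
function name and the 10,000-instruction example are illustrative, and
given the low R^2 the outputs should not be read as usable estimates.

```python
import math

# Coefficients reported for the total-development regression.
a, b, c = -3.1588, 0.7850, -0.6090

a_prime = math.exp(a)   # multiplicative constant
c_prime = math.exp(c)   # factor applied when language = 1 (high-level)


def predicted_man_months(obj_inst, high_level):
    """MM = a' * (obj. inst.)^b * (c')^language, with language in {0, 1}."""
    return a_prime * obj_inst ** b * c_prime ** (1 if high_level else 0)


print(round(a_prime, 2), round(c_prime, 2))
print(round(predicted_man_months(10_000, False), 1),
      round(predicted_man_months(10_000, True), 1))
```

Read this way, c' is the ratio of predicted effort for a high-level
language program to that for an assembly language program of the same
object size.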

The next two illustrations show the same hypothesis examined in
terms of the effort in particular skill categories. These results were
obtained with a subset of the ADPREP data that reported effort by analysts
and programmers. We first consider the analysts. The analyst makes a
primary contribution during analysis and design activities of a project,
although in many of the data points of the ADPREP sample, analysts were
employed throughout the lifetime of the project to code and test the
product. Thus, it is not always possible to associate the analyst's
effort with a unique subset of the life-cycle activities of the ADPREP
data. The equation used in the regression analysis shown in Fig. 6.3 is

$$\ln(MM_{analysts}) = a + b\,\ln(\text{obj. inst.}) + c\,(\text{language})$$

The statistics of the regression are

Coefficients        T Statistics        Other Statistics
a = -1.7246                             R^2 = 0.3142
b =  0.550          b:  1.472           F   = 2.749
c = -1.407          c: -2.309

Figure 6.4 shows the same analysis of the man-hours reported for
programmers. Again, it is not possible to attribute programmer activities
to a unique subset of the life-cycle activities. The equation used in the
regression analysis shown in Fig. 6.4 is

$$\ln(MM_{\text{programmers}}) = a + b\,\ln(\text{obj. inst.}) + c\,(\text{language})$$

The statistics of the regression are

Coefficients        T Statistics        Other Statistics
a = -2.7364                             R^2 = 0.4110
b =  0.665          b:  2.008           F   = 4.187
c = -1.509          c: -2.793

[Figure 6.4. Vertical axis: LN(PROGRAMMER M/M).]

Each  of  these  equations  can  also  be  expressed  in  "exponentiated"  form. 

The  reader  is  again  cautioned  that  the  statistics  reported  above  do  not 
apply  to  these  modified  equations. 

The results do indicate that programming language is an important
factor in both of these relationships. The results are not useful as
an estimation technique because of the low R^2 values. The F statistic
is also very marginal, although significant at the 0.95 confidence level.
The T statistic associated with the programming-language variable is also
significant for these two relationships, although the coefficient on
object instructions becomes marginal in the first equation (analyst).
Hence the relationships are weak, but indicate some importance for the
programming-language variable. The fact that this variable does not show
significance for programmers and insignificance for analysts (it is
significant for both) is viewed as due to lack of precision in defining
the functions carried out by the individuals and not a refutation of the
hypothesis.

It is interesting to note that the constants of proportionality
(e^(-c)) related to language vary between about 2 and 5. That is,
assembly-language programs are 2 to 5 times more expensive, per object
instruction generated, to develop. This result agrees with the results
reported in controlled experiments with language, and with informal
rules-of-thumb conveyed in private conversations (Sec. 7).
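The range of 2 to 5 can be reproduced from the language coefficients reported for the three regressions of this section. The sketch below is illustrative only; it assumes that the constant of proportionality is e^(-c), i.e., the ratio of assembly-language effort to high-level-language effort at equal object-code size, and the names are hypothetical.

```python
import math

# Language coefficients c reported for the three regressions
# (language = 0 for assembly, 1 for high-level).
LANGUAGE_COEFFICIENTS = {
    "total development": -0.6090,
    "analysts": -1.407,
    "programmers": -1.509,
}

def assembly_cost_ratio(c):
    """Effort ratio assembly/high-level at equal object size: e**(-c)."""
    return math.exp(-c)

for name, c in LANGUAGE_COEFFICIENTS.items():
    print(f"{name}: {assembly_cost_ratio(c):.2f}")
```

The three ratios come out at roughly 1.8, 4.1 and 4.5, which is consistent with the stated range.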

6.3.3  Resource  Constraints 

The discussion of Sec. 3 argued that hardware constraints should
affect the cost of developing software. This was reflected in the
hypothesized estimating relationships for the coding and testing phases.
This effect has been previously observed and documented, and has been
incorporated in the results of many researchers. The most general form
of this relationship shows development costs growing exponentially with
increasing utilization of the hardware. Our examination of the literature
showed that there were no attempts to quantify this relationship, so the
hypotheses of Sec. 3 have simplified the relationship to a single factor
representing the presence or absence of such a constraint, the constraint
being assumed to be present if more than 95 percent of the computer's
memory was being utilized.

No specific data on the costs of individual phases (as defined
by the process model of Sec. 2) were available to test the specific
hypotheses of Sec. 3. However, some data from an earlier GRC study
(Ref. 22) had been applied to estimate the importance of resource
constraints. As discussed in Sec. 6.2.3, these earlier data were allocated
to three phases (design, coding, and test).* This earlier study examined
the effort required in each phase as a function of resource constraints.

The results obtained for each phase are shown in Table 6.1 and
illustrated in Figs. 6.5 to 6.7. In each case, effort is approximately
linear with size, and hardware constraints, when present, increase costs
by a factor of 5 or more.** It is also interesting to note that the ratio
of testing effort to coding effort gets smaller with increasing size.

The results differ somewhat from the hypotheses presented in Sec. 3.
In that section, it was claimed that the hardware constraint would
affect only the coding and testing costs. These results indicate that
the constraint also drives design costs. It is recommended that this
additional hypothesis not be added until the exact accounting principles
used to determine costs in that data base are understood and related
to the phases of the process model used in this report.

*  Not the same phases as defined in Sec. 2.

** The factor of 5 is observed when converting the equations in Table 6.1
   to a linear form; e.g., in the last equation e^1.640 = 5.16.


TABLE 6.1

HARDWARE-CONSTRAINT EQUATIONS

$$\ln(\text{M-Y design}) = -1.17 + 1.03\,\ln(\text{obj. inst.}/1000) + 1.730\,X$$

$$\ln(\text{M-Y coding}) = -2.3 + 1.13\,\ln(\text{obj. inst.}/1000) + 2.120\,X$$

$$\ln(\text{M-Y testing}) = -0.86 + 0.934\,\ln(\text{obj. inst.}/1000) + 1.640\,X$$

where

X = 0 if no hardware constraint
X = 1 if more than 95% of memory is utilized

The results indicate that the presence or absence of the hardware
constraint is important in explaining variations in development costs.
Its effect is clearly secondary to that of code size. The coefficients
of X that prescribe the effect of the constraint on costs differ
considerably from results obtained elsewhere,* which indicate that
constrained software is approximately twice as expensive, rather than
five times, as indicated in Table 6.1. Since only three of the data
points considered in these relationships represent constrained
developments, differences are not surprising, and one should not take
the actual factor of five literally.

* Doty Associates claimed a significantly smaller impact at the second
  Technical Direction Meeting for this project.


Figure 6.5. Effect of Hardware Constraint: Design
(Vertical axis: man-years of design. Horizontal axis: number of object
instructions, thousands.)

Figure 6.6. Effect of Hardware Constraint: Coding
(Vertical axis: man-years of coding. Horizontal axis: number of object
instructions, thousands, 1 to 1000.)

Figure 6.7. Effect of Hardware Constraint: Testing
(Horizontal axis: number of object instructions, thousands.)
Both this study and other results do indicate the importance of this
variable. Another explanation might be that the three data points are
all around 98 percent of capacity, while data used in other studies
could be nearer 95 percent constrained (see Sec. 7.5 for further details
on how the amount of constraint bears on the factor).
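The hardware-constraint equations of Table 6.1 can be evaluated directly. The following is a minimal sketch using the coefficients as printed in the table; the function and variable names are hypothetical, and, as noted above, the constraint factors rest on only three constrained data points and should not be taken literally.

```python
import math

# (intercept, size exponent, constraint coefficient) per phase, Table 6.1.
TABLE_6_1 = {
    "design": (-1.17, 1.03, 1.730),
    "coding": (-2.3, 1.13, 2.120),
    "testing": (-0.86, 0.934, 1.640),
}

def man_years(phase, object_instructions, constrained):
    """Man-years for one phase; constrained means more than 95% of
    memory is utilized (X = 1), otherwise X = 0."""
    a, b, c = TABLE_6_1[phase]
    x = 1 if constrained else 0
    return math.exp(a + b * math.log(object_instructions / 1000.0) + c * x)

def constraint_multiplier(phase):
    """Cost multiplier implied by the constraint term: e**c."""
    return math.exp(TABLE_6_1[phase][2])

for phase in TABLE_6_1:
    print(f"{phase}: x{constraint_multiplier(phase):.2f}")
```

The testing multiplier reproduces the 5.16 quoted in the footnote; the design and coding multipliers are larger.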

6.3.4  Design  Changes

The discussion and hypothesized relationships of Sec. 3 ignore the
impact of changing requirements during the life cycle of software,
particularly during development. Experience with software developments
has indicated that changing requirements can, in themselves, be an
explanation of variations in product costs. Because of such things, the
delivered product size understates the amount of software which may
actually have been written. Thus development effort, per delivered
instruction, is considerably overstated.

Data was not available in sufficient detail to analyze the direct
impact of requirement changes on individual life-cycle activities. The
ADPREP data did report, by project, the number of changes during
development.* Unfortunately, there was little discussion of the nature of
the changes and no attempt to quantify the impact of an individual change.
Thus, the only measure of change activity during development was the
cumulative number of changes incorporated into the software.

* The SAMSO data also reported on changes, but could not be used in this
  analysis because the effort expended by contractors was not broken down
  by CPCI, but aggregated over the entire effort.

In an attempt to use the available ADPREP data, it was assumed that
each change that occurred during development modified some average fraction
of the software that was intended for delivery. The number of object
instructions affected by changes was estimated as the product of the size
of the delivered code, the cumulative number of changes during development,
and the estimated fraction of the code that changed with each requirement
change. It was hypothesized that development effort was linearly related
both to delivered program size and to the amount of code that changed
during development:

$$MM_{\text{develop}} = a_0 + a_1\,(\text{obj. inst.}/1000) + a_2\,[(\text{obj. inst.}/1000) \times \text{no. changes}]$$

The above equation can also be written in the following form:

$$MM_{\text{develop}} = a_0 + (\text{obj. inst.}/1000)\,[a_1 + a_2 \times \text{no. of changes}]$$

It can then be seen that a_2 is the added effort per change to the
coefficient a_1 that relates programmer productivity to program size.


Using a subset of the ADPREP data points that reported on changes
to the development project, a regression analysis of this relationship
was performed. The R^2 value is certainly promising, and indicates that the
relationship is useful, and the T value for a_2 is significant, thus
indicating the importance of the variable in explaining costs. The
coefficient a_1 is not significant, and the term could be dropped.


Coefficients        T Statistics
a_0 = 6.6102
a_1 = 0.678         a_1: 1.105
a_2 = 0.110         a_2: 3.998

Other Statistics
R^2 = 0.65
F   = 12.30


These results can also be used to provide insight about the average
amount of the program that is modified with each requirement change. It
should be obvious that the effects of the change process are not stationary
over the development lifetime, and that the expected fraction of the program
changed will depend on when, in the development, a requirement is changed.
For example, a change occurring during the design phase does not require
that any code be rewritten. It does, however, require that the design
be reassessed to determine if the interfaces and interactions of the
components are still reasonable in the light of the change. On the other
hand, a change during coding may require that the design be reviewed,
and that new code be introduced and integrated with the system. This
would clearly require more effort.

The coefficient a_1 has units of man-months per instruction. The
coefficient a_2 has units of (man-months x fraction changed)/(instruction
x number of changes). By taking the ratio of a_2 to a_1, the average
fraction of the delivered program that would be altered with each
requirement change can be estimated. For the data cited in the ADPREP
study, this estimate works out to 16 percent.
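The 16 percent figure follows directly from the regression coefficients. The sketch below is illustrative only: the names are hypothetical, the coefficient values are those reported for the change regression, and program size is taken in thousands of object instructions as in the regression equation.

```python
# Coefficients reported for the requirement-change regression.
A0, A1, A2 = 6.6102, 0.678, 0.110

def development_man_months(object_instructions, n_changes):
    """MM = a0 + (obj. inst./1000) * (a1 + a2 * no. of changes)."""
    k = object_instructions / 1000.0
    return A0 + k * (A1 + A2 * n_changes)

def fraction_changed_per_change():
    """Average fraction of the delivered program altered per change,
    estimated as the ratio a2 / a1."""
    return A2 / A1

print(f"{fraction_changed_per_change():.0%}")      # 16%
print(f"{development_man_months(10000, 5):.2f}")   # 18.89
```

For a 10,000-instruction program with five changes, the change term contributes 5.5 of the 18.89 man-months in this sketch.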

Factors related to changing requirements can be applied in two
ways. The first and most important application is in normalizing data
that will be used to establish cost estimation techniques. An initial
cost estimation technique is oriented towards predicting the costs of
a project that is expected to proceed normally. Developing such an
estimator from data in which changing requirements have had a significant
impact on cost is clearly the wrong approach. A nominal estimation
technique assumes the existence of a separate estimation technique for
unforeseen design changes. A separate estimator for design changes
would allow costs and benefits to be considered for each change prior
to its incorporation into the development.

The second application of information about requirements changes
is the development of a cost estimating technique for life-cycle cost
that requires the user to estimate the expected number of changes that
will occur during the project lifetime. The impact of change can be
anticipated and explicitly incorporated into the initial estimate of
costs. Making an estimate of the expected changes to a project requires
that the number of changes be related to some other, more easily estimated,
characteristics of the project such as duration or application area.
Without such an auxiliary relationship, estimation of the expected number
of changes would be nothing more than a guess.


6.4  ALTERNATIVE  PRODUCT  MEASURES

Throughout Sec. 3 of this report, an attempt was made to relate
measures of the product of a software development to the effort and costs
associated with that product. The most obvious and commonly suggested
measure is software size, expressed as the number of either
source-language or object-language statements. Each seems to have an
appropriate application, and it is important that a consistent measure be
used when comparing data on various projects.

Inconsistency in this measure of program characteristics is probably,
by itself, significant in explaining variation in the data reported, since
it could vary by a factor of 2 to 5 (as discussed in Sec. 6.3.2 on the
impact of programming language). Since there is no way of checking
most reported data, the possible inconsistency of this measure must be
accepted.

Assuming  that  all  measures  of  computer  programs  can  be  converted 
to  a normalized  form  through  a scaling  relationship,  a further  possible 
source  of  inconsistency  exists.  It  can  be  argued  that,  for  high-level 
languages,  programming  style  (i.e.,  coding  conventions  and  techniques) 
can  cause  the  relationship  between  source  code  and  object  code  to  vary 
significantly  from  programmer  to  programmer,  therefore  invalidating 
any  scaling  relationship  between  source  and  object  codes. 

The AOES data described a large number of programs in terms of both
number of source-code card images and number of object-code instructions
generated by a JOVIAL compiler. This data offered an opportunity to test
the claim that significant variations in the relation between
source-code and object-code instruction counts are introduced by
programmer style.
The set of JOVIAL programs considered all run on the same computer,
so variations introduced by computer instruction sets are removed.
Furthermore, the programs perform a wide range of functions, from orbit
determination to compilation and operating system functions, and are
written by many authors. Using the subset of programs written entirely in
JOVIAL, the following regression analysis was performed:

$$\text{obj. inst.} = a_0 + a_1\,(\text{card images})$$

A very strong linear relationship was found between source and object code
for this set of JOVIAL programs. The statistics of the regression analysis
are very significant.

Coefficients        T Statistics        Other Statistics
a_0 = -30.495                           R^2 = 0.94
a_1 = 2.4           a_1: 73.21          F   = 5360.0

These results indicate that the claim is invalid and that stylistic
variations in the use of JOVIAL become insignificant over the length of an
average program. The data used and the results of the regression analysis
are shown in Fig. 6.8.

This suggests that a table of conversion factors for language can
be compiled for various programming languages and host computers. These
factors could be applied to construct normalized measures of product
size. Table 6.2 is a compilation of such data based on the information
used in this study. For example, the above equation would indicate that
for the CDC 3800 JOVIAL system, a JOVIAL statement (assuming one per
card) results in an average of 2.4 object instructions. The table is
far from complete, and should be extended to include other commonly
used computer systems and languages.
Figure 6.8. Object Instructions Vs Number of Cards


TABLE 6.2
CONVERSION RATIOS

Name        Machine        Language    Source         Object      Ratio
SAMS        IBM 360/65     COBOL       10,685         37,395      3.5
SCS         IBM 360/65     COBOL       23,395         81,883      3.5
TAAOS       IBM 360/65     COBOL       15,466         65,779      4.3
MACE        B-5500         COBOL       17,792         62,274      3.5
STATEM      B-5500         COBOL        9,579         33,516      3.5
ADMSS       CDC 3300       COBOL       11,405         19,720      1.7
CRFS        CDC 3300       COBOL        5,288          8,727      1.7
DIS         CDC 3300       COBOL        3,572          5,358      1.5
AOES        CDC 3800       JOVIAL         839*         1,972*     2.4
ADOBE       IBM 740        FORTRAN      4,532         22,500      5.0
TRAID       CDC 7600       FORTRAN     [illegible]       269*     3.5
MAFR        RCA 501        COBOL       25,000        100,200      4.0
MILSTAMP    UNIVAC 1050    COBOL       24,310         97,250      4.0
MPAS        IBM 360/30     COBOL       29,631         85,148      2.9

* Expected values per program, derived from a regression analysis that
  established the ratio.
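The ratios in Table 6.2 and the JOVIAL regression can be applied as simple conversion routines. The following sketch is illustrative only; the names are hypothetical, a few rows of the table are used, and the JOVIAL coefficients are the rounded values reported for the CDC 3800 regression (a_0 = -30.495, a_1 = 2.4).

```python
# (name, source statements, object instructions) for selected rows
# of Table 6.2.
ROWS = [
    ("SAMS", 10685, 37395),
    ("TAAOS", 15466, 65779),
    ("ADMSS", 11405, 19720),
    ("MPAS", 29631, 85148),
]

def conversion_ratio(source, obj):
    """Object instructions generated per source statement."""
    return obj / source

def jovial_object_instructions(card_images, a0=-30.495, a1=2.4):
    """Expected object instructions for a CDC 3800 JOVIAL program,
    using the linear regression on card images."""
    return a0 + a1 * card_images

for name, source, obj in ROWS:
    print(f"{name}: {conversion_ratio(source, obj):.1f}")
print(f"{jovial_object_instructions(839):.0f}")
```

With the rounded slope, the regression gives about 1,983 object instructions for 839 card images, close to the 1,972 tabulated for AOES; the small difference presumably reflects rounding of the published coefficients.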


Some recommendations should be made regarding the future collection
of program sizing data. The real product being generated during a
software development is represented by source-code statements. It
therefore seems reasonable to report product size by the number of punched
cards (or card images) containing source-code statements. However, while
many programming languages are generally punched one statement to a card
(e.g., FORTRAN and assembly languages), others are not (e.g., JOVIAL and
ALGOL). For such languages it is possible to have more than one statement
per card. In these cases, the number of card images may understate the
number of statements.

The card count should include comment cards. Comments can be
very effective in describing algorithms and aiding in debugging, and
they should be considered an integral part of the program being delivered.

It is our recommendation that programs be measured in terms of the
number of source cards delivered. It is important that scaling factors
be developed to relate one language to another (e.g., FORTRAN statements
per card to JOVIAL statements per card). For assembly-language programs,
scaling factors should also be developed to account for differences in
machine instruction sets. Such differences can also affect the number
of statements required. These factors are important in arriving at
normalized measures of product that are language-independent, for
comparison of programs written in different languages for different
systems.

Software development produces other products than software, notably
the documentation that accompanies the software. It is possible that a
measure of documentation can serve as a surrogate for product size
measured in source-language or object-language instructions. To test
this conjecture, it was assumed that documentation describing a project
was related to the final product, so long as common documentation
requirements and formats applied. Relaxing this last constraint makes any
comparison meaningless.

The AOES data provided information that could be used to test the
validity of this conjecture. For each of the many programs examined,
the page counts for Part II Specifications (subject to common formats
which are described in MIL STD 483, Appendix IV) were available, as well
as measures of the product size in terms of the number of source
statements. The following regression was performed:

$$\ln(\text{source cards}) = a_0 + a_1\,\ln(\text{pages of Part II Specs})$$

with the following significant statistics:

Coefficients        T Statistics        Other Statistics
a_0 = 7.37                              R^2 = 0.56
a_1 = 1.403         a_1: 15.9           F   = 252.0

Figure 6.9 illustrates the results of the regression analysis.

The relationship has a low R^2 value that indicates it should not
be used to replace existing measures of software size. The relationship
might be improved upon by considering other factors such as the type of
software. (Recall that the AOES data contains a wide variety of software
types.)

A relationship  between  documentation  and  program  size  might  be 
used  to  screen  initial  estimates  of  program  size  at  the  time  of  CDR 
when  the  first  documentation  exists  in  draft  form.  Such  a relationship 
would  allow  the  original  estimates  of  software  sizes  to  be  compared 
with  the  sizes  of  corresponding  documentation.  This  comparison  could 
identify  initial  estimates  that  are  at  significant  variance  with  that 
predicted  by  the  size  of  their  documentation.  The  sizing  of  the  program 
product  might  be  subjected  to  review,  leading  to  a possible  revision  of 
project  costs  and  schedule.  One  should  view  such  a relationship  as  being 
unique  to  a project,  and  serving  only  as  a general  management  tool. 

The  relationship  also  offers  promise  in  directly  calculating  the 
cost  of  documentation.  Since  lines  of  code  are  related  to  number  of  pages, 
an  estimating  equation  for  documentation  could  be  calculated.  Alternatively, 
the  documentation  estimate  could  indirectly  be  related  to  lines  of  code  by 
estimating it as a percentage of development man-hours, if the latter
is based on lines of code.
Figure 6.9. Number of Cards Vs Number of Pages of Part II
Specifications
6.5  CHARACTERISTICS  OF  THE  MAINTENANCE  ACTIVITY


Few previous cost estimating studies of software have examined this
activity of the life cycle to any significant extent. In this study, it
became apparent that the entire process was very poorly understood. As
discussed in Sec. 2, the maintenance activity for software has two
components: (1) error correction, and (2) improvement or refinement of the
software product. Our survey of the literature on software reliability
indicated two trends related to these activities. First, the improvement
cycle is a regular process for any system that has remained usable and
responsive to users. Second, error histories of software projects exhibit
a cumulative number of errors that increases with program size, and a mean
time between error detections that increases with the correction of errors.
That is, software becomes more reliable with increased use and correction
of problems. These two conjectures guided a number of analyses intended
to confirm them for the maintenance process.


The maintenance data available to this study was derived from two
sources: (1) AOES data, and (2) interviews with the Defense Satellite
Program (DSP) office and their contractors. Both software systems are
examples of ground-based command and control systems that are in the
operation and maintenance phase of their life cycle. It was possible, in
each case, to assemble data describing the characteristics of the
maintenance process (e.g., number of problems reported, dates of major
revisions, etc.). However, it was not possible to relate these
characteristics to the effort and costs expended on particular revisions
and corrections.


The AOES data does provide some insight into the relative proportions
of effort devoted to error correction and to product improvement. The data
includes the allocation of direct technical man-hours during maintenance for
the AOES project, in terms of the activities of the two associate contractors.
(The integration contractor is not considered in these figures.) One of the
associate contractors spent 25 percent of its man-hours on error correction
and 75 percent on product improvement. For the other associate contractor,
the division was 33 percent and 67 percent. Thus, for both contractors,
most of the effort went to improvement rather than error correction.


In order to understand the maintenance effort, the events (revisions
and errors) within the maintenance activity were investigated. The cost of
revisions to existing versions of software to incorporate new requirements
could presumably be estimated by techniques similar to those used for
development activities. Such techniques would be based upon characteristics
of the newly written parts of the code, but would also reflect the existence
of a body of code and previous experience that would not be present in a new
development.

Figure 6.10 illustrates histories of revisions in the AOES and DSP
projects. It shows time lines for DSP's Overseas Ground Station (OGS) and
CONUS Ground Station (CGS) programs, and AOES's Satellite Control Facility
(SCF) System Support Tape. Each time line shows the major releases of each
system.

For CGS and OGS, the revisions to the system occurred at fairly
regular intervals until version 6. At this point a very extensive
reorganization of both systems was made that required a longer time. For
SCF, the minor revisions (indicated by letters appended to the version
number) occurred at close intervals and the major system revisions
(SCF-15.0, SCF-15.1, SCF-15.2) occurred at much longer intervals.

The point is that a regular pattern of modification takes place and
is to be expected during the operations period. The pattern observed is
quite consistent with recently published data describing the evolution of
other large software systems. However, these modifications have to do
largely with product improvement, rather than the initial development's
success or failure in meeting original requirements. Therefore no
trade-off exists, and a decision to fund the improvement should be based on
costs and benefits perceived at that time.

Error correction, on the other hand, is clearly driven by the problems
reported during operation of a system and the response times required to
correct the reported errors.
The technical literature on software reliability indicates that the
error process of operational software exhibits two major trends. First,
the total number of problems encountered with a given model of software is
limited, and the limit is related to the size of the software. Second,
the mean time between error detections increases with use and corresponding
error correction. Figure 6.11 indicates both of these trends.

The data collected for the AOES project were at a sufficient level of
detail to allow testing of some of these assertions. Each release of the
system was described by records of the changes made to produce the release,
and the errors encountered with operational use of the release. Changes to
the program resulted from two things. First, design change requests (DCRs)
could be incorporated. Effectively, these are engineering changes. Second,
discrepancy report forms (DRFs) associated with the base system used in
generating the revision could be corrected in the new release. These
uncorrected errors could have been temporarily repaired with "octal
correctors," or been deferred by using operating procedures that avoided
the problem. Both the DCRs and DRFs caused the source code forming the new
release to be modified. Each DCR and DRF could be associated with unique
computer programs in the system, so that a complete history of modification
could be developed by program. Errors encountered with the operational use
of the new release of the system were reported as DRFs written against
programs in the new release, rather than the base system.

It was first hypothesized that the parameter "number of problems"
relates only to the size of the programs. A regression analysis of the
equation

$$\text{No. of problems} = a_0 + a_1\,(\text{no. cards}/1000)$$

was performed, with the scatter shown in Fig. 6.12. The fit was clearly
poor. Examination of the data, however, showed that the points above the
line corresponded to those programs with a significant number of design
changes. It was therefore hypothesized that the number of problems
reported, once a release of the system became operational, would be related
to the size of the software (residual errors), the new requirements
introduced with the release of the modified system, and the problems
corrected within the base system by the new release.

Figure 6.12. Number of Errors Vs Number of Cards

It was hypothesized that each new requirement modified an average
fraction of the program, and that each modification introduced some
number of new problems proportional to the size of the modification.
This was represented by combining the two factors into a single product,
represented by the constant a_2 in the following equation. It was further
hypothesized that the errors corrected in a base release would also
modify an average amount of the program, and in turn, introduce new
problems with numbers proportional to the size of the change. Again,
the two considerations were combined as a product and represented by
the constant a_3 in the following equation. The form of the estimator
representing these hypotheses is

$$\text{No. problems} = a_0 + a_1\,(\text{no. cards}/1000) + a_2\,[(\text{no. cards}/1000) \times \text{no. design changes}] + a_3\,[(\text{no. cards}/1000) \times \text{no. problems corrected}]$$

The following statistics indicate that each variable in the above equation
is significant in explaining the reported problems with the software
version once it was released for operational use.


Coefficients       T Statistics       Other Statistics

a_0 = 0.3964                          R^2 = 0.64
a_1 = 0.368        t_1 = 3.017        F = 249.0
a_2 = 0.686        t_2 = 8.193
a_3 = 0.482        t_3 = 13.791

The literature has also reported some rules of thumb regarding
the expected number of problems with software systems. Figure 6.13 shows
a collection of published data describing software reliability, taken
from references in Appendix B. The data points symbolized by dots
represent the cumulative number of problems reported through operational use
of the programs. The general rule of thumb applied by program testing
and reliability investigators is that the total number of problems will
be between 0.5 percent and 1.5 percent of the size of the program. This
range is shown by dashed lines. Note that most of the dots fall within
the range.
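
As a quick illustration of the rule of thumb, the sketch below computes the band of expected cumulative problems for a given program size. It assumes that "size" is measured in instructions (or cards); the function name is hypothetical.

```python
def expected_problem_range(program_size, low_fraction=0.005, high_fraction=0.015):
    """Return (low, high) bounds on cumulative problems for a program.

    The 0.5 and 1.5 percent bounds are the rule of thumb quoted in the text;
    the unit of program_size (instructions or cards) is an assumption.
    """
    return program_size * low_fraction, program_size * high_fraction


if __name__ == "__main__":
    low, high = expected_problem_range(10_000)
    print(f"expect roughly {low:.0f} to {high:.0f} problems")
```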


The data points symbolized by squares are for the Bell Telephone
Electronic Switching System. This represents a system that has high
reliability requirements, and was issued in multiple releases after
thorough testing. The problem reports are only for those problems
encountered in operational use. Note that these data fall far below
the trend for other programs, as might be expected. The remaining
points, symbolized by triangles, represent data reported by Logicon
(Appendix B - Rubey, et al.) describing the effectiveness of verification
and validation (V&V) activities. The data represent the number of
problems encountered and resolved during a V&V activity for a sample of
programs. Notice that although not all problems have been removed, a
significant fraction of the total that one might expect to encounter
is removed by the V&V activity.

These  results  show  a clear  relationship  between  characteristics  of 
the  computer  programs  developed  and  the  number  of  errors  that  one  expects 
to  encounter  in  system-level  testing  and  operational  use  of  the  software. 

The  literature  also  indicates  that  the  rate  of  encountering  such  errors 
should  decrease  with  time  and  corrections  to  the  software.  These  two 
pieces  of  information  are  the  basis  for  any  estimation  technique  for  the 
costs  of  the  software  maintenance  activity.  Since  no  cost  data  exist 
that  can  be  related  to  these  error  rates,  it  has  not  been  possible  to 
test  any  estimation  technique. 

There is one other facet of the software maintenance activity
that remains to be tested. Intuitively, the choice of programming
language should have a dramatic impact on maintenance. The goal of
high-level programming languages is to make programs more readable and easily
modified; therefore, high-level language programs should exhibit lower
maintenance costs. The ADPREP data contain information that allows this
hypothesis to be tested. For each program reported on, the staffing of
the maintenance function is described in terms of the total number of
maintenance people assigned per month.

In Fig. 6.14, programs written in machine language are plotted
with dots and programs written in a high-level language are plotted
with squares. It is generally difficult to discern whether the published
data represent total maintenance staff at multiple sites or the
maintenance staff assigned at a single site. There is some question whether
the highest three points may be reported differently than the remaining
data. If they are, one might argue that the maintenance required is
independent of program size. It could also be argued, equally well,
that separate trends exist for machine language and high-order language,
indicating that the use of high-order languages makes a significant
difference in the staffing required for maintenance. This latter
hypothesis is intuitively appealing, and would suggest that there are
substantial benefits to be achieved with the use of high-order languages.

Figure 6.14. Maintenance Requirements for Machine-Language and
High-Order-Language Programs

A linear regression of the high-order-language data points was
performed. Independently, a linear regression of the machine-language
data points was performed. Each used the general form:

    No. Maintenance Personnel = a_0 + a_1 (obj. inst./1000)

The results were:

Machine Language Programs

    No. Staff = 1.14 + 0.19 (obj. inst./1000)

    R^2 = 0.43

High-Order Language Programs

    No. Staff = 0.34 + 0.05 (obj. inst./1000)

    R^2 = 0.38

The regression of the high-order-language cases omits one data point with
31 maintenance personnel assigned. Apparently the exceptionally high
number of staff is explained by the existence of replicated maintenance
facilities, with the total personnel of all facilities being reported.
Note also that the R^2 statistics are not very good, and the
relationships are tenuous.

In these two regressions, the maintenance staffing for machine-language
programs is approximately four times as large. This number is
consistent with language factors derived in Sec. 6.3.2 of this report
that show assembly language requires 2 to 5 times the effort of
high-level-language programs with the same number of object instructions.
It also shows that assembly language is not used for larger programs,
probably due to the loss of efficiency.
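
The two fitted lines can be compared for any given program size. The sketch below assumes the regression constants quoted above (1.14 and 0.19 for machine language, 0.34 and 0.05 for high-order language); the function name and the sample size are illustrative only, and the weak fit of both regressions should be kept in mind.

```python
# Intercept and slope (staff per 1,000 object instructions) from the two
# regressions in the text. Names and the example size are hypothetical.
STAFFING_FITS = {
    "machine": (1.14, 0.19),
    "high-order": (0.34, 0.05),
}


def maintenance_staff(object_instructions, language):
    """Estimate maintenance staff for a program of the given size."""
    intercept, slope = STAFFING_FITS[language]
    return intercept + slope * (object_instructions / 1000.0)


if __name__ == "__main__":
    size = 20_000  # hypothetical program size, object instructions
    machine = maintenance_staff(size, "machine")
    high_order = maintenance_staff(size, "high-order")
    print(f"machine: {machine:.2f}, high-order: {high_order:.2f}, "
          f"ratio: {machine / high_order:.1f}")
```

The ratio of the slopes alone is 0.19/0.05 = 3.8, which is the "approximately four times" figure; at any finite size the intercepts shift the ratio slightly.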

Based upon this initial examination of the maintenance activity,
several guidelines about the maintenance process can be stated. The
first is that the program-improvement aspect of maintenance is a regular,
repeating process for programs that have a long useful lifetime. Because
maintenance staffing appears to be somewhat arbitrary, the extent to
which programs are improved and the schedules for this type of maintenance
are probably based on the residual manpower available after
error-correction activities for the currently operational version of a system.

The error-correction aspect of program maintenance appears to follow
some general trends. First, the total number of errors is probably
limited, with the limit related to program size and the number of
modifications installed in the program during maintenance. Second, the MTBF
for software will increase with time and corresponding error correction.
Thus, for a static program, one should expect to be able to taper the
maintenance effort with time. The AOES data support this conjecture.

If each release of the system did not stabilize with age, then the
maintenance effort required should grow with the number of releases being
concurrently supported. There is no evidence to indicate such growth.
7 STATE-OF-THE-ART TECHNIQUES FOR ESTIMATING SOFTWARE COSTS

Although this study did not result in the development of CERs for
each of the life-cycle phases, we have evaluated a number of software
resource relationships. However, the work described in the previous
sections is not very useful for Program Offices unless it is incorporated
into previously developed software cost estimating techniques. This
section briefly describes some of the more important techniques from the
literature and elsewhere that were reviewed during this study. This
information is intended to provide some insight into the current state of
the art in software cost estimating.

This was not a major focus of the study; we do not claim this
section is a thorough review, evaluation, or assimilation of all software
cost estimating techniques that have been developed or proposed.* Rather,
it is a composite of our experience which incorporates results from
other sources. No software cost estimating techniques have yet
demonstrated much precision. The reader is advised to use the techniques
very carefully, keeping in mind their acknowledged imprecision.

Two basic types of software cost estimating models are discussed
here. The first is a disaggregated approach in which the costs for
some of the major life-cycle phases are estimated separately and then
combined to form an estimate of the total cost (Sec. 7.1). The second
is an aggregated approach in which total costs for a software project
are estimated and then allocated to the life-cycle phases (Sec. 7.2).

The disaggregated model is based primarily on earlier GRC work (Ref. 22);
the aggregated approach has been used for most software cost estimating
to date.


* A number of reviews have previously been published (Refs. 1, 8, and 9).
In Sec. 7.3 we discuss various special factors (language, hardware
constraints, etc.) which significantly impact the cost of software, and
in Sec. 7.4 we discuss how to time-phase aggregated estimates.

The estimation of software maintenance costs is discussed in
Sec. 7.5 and concluding remarks are presented in Sec. 7.6.

7.1  BASIC  MODEL  (DISAGGREGATED) 

The most common approach to software cost estimating is to utilize
a basic model to estimate total man-months, computer hours, documentation,
etc. for the entire project, and then use these estimates to derive
the total project cost. However, in September 1975, under the sponsorship
of the National Aeronautics and Space Administration, General Research
Corporation developed a basic model which employs a disaggregated rather
than an aggregated approach to software cost estimating (Ref. 22). Two basic
models were developed: an "initial model" and a "refined model." The
initial model was based upon selected factors and relationships from the
literature, together with a number of data points and observations
developed by GRC. The second model was based upon another set of data
compiled by GRC. Each of these is described below.

7.1.1  Initial  Software  Costing  Model 

The steps required to develop an estimate using the initial
software costing model are as follows:

1. Estimate Total Software Size. The first step is to estimate
the number of object instructions to be developed for the entire system.
In developing this estimate, the required functions are divided into
subfunctions; the size of each subfunction is estimated, either by
analogy to a previously implemented, similar subfunction, or by a
constraints technique (Ref. 44); and the individual estimates are combined into
an estimate of the total size of the software system.

2. Estimate Effective Software Size. The estimate developed
in step (1) is modified to account for the word length of the target
computer and the effects of design change activity. The model described
in Ref. 22 assumes a word length of 32 bits; adjustments are made if
the target computer has a different word length. Furthermore, it was
estimated that design change activity increases the effective size of the
software by 70 percent (e.g., 1700 instructions are developed in order
to deliver a 1000-instruction program).

3. Estimate Coding Labor. Coding labor is estimated from the
effective software size estimated in step (2). Some preliminary data
on coding productivity are presented in Table 7.1. The actual coding rate
selected should be based on the complexity of the software, as well as
its size. If different portions of the software are of different
complexity, then different rates should be used for each portion.


TABLE 7.1

AN INTERIM CODING PRODUCTIVITY MODEL
(Ref. 22)

                                        Instructions per Man-Month,
                                        by Project Size (Instructions)

Difficulty                              Total Size      Total Size
                                        <30,000         >30,000

Easy
  (Batch, Few Interactions)             4,000           2,000

Medium
  (Adaptation or Some Interactions)     --              1,000

Difficult
  (Real-Time, Many Interactions)        --              300
4. Adjust Coding Labor. The coding labor estimated in step (3)
is next adjusted to account for hardware constraints and programming
language. It was estimated (Ref. 22) that the impact of constrained
hardware resources is approximated by:

Adjusted  Coding  Labor  • I ^ — \ x Coding  Labor 

\1  - ^ - 0.5/ 


where

F = fraction of memory used.

It was estimated (Ref. 22) that the use of a high-order language requires 25
percent more memory than the use of object language, but that it reduces
the required coding labor to 20 percent of that required by the use of
object language, for each object-language instruction.


5. Estimate Total Technical Labor Cost. The estimate of total
technical labor cost is based on the estimate of coding labor developed
in step (4). It was estimated (Ref. 22) that the percentages of effort devoted
to each of the major phases of software development are as follows:

• Analysis and Design (36%)

• Coding (21%)

• Integration and Testing (43%)*

Based on these estimates, the amount of labor required for analysis,
design, integration, and testing may be estimated. Totaling the labor
estimates for all three phases and multiplying by the estimated labor
cost gives an estimate of the cost of all technical labor.


6. Estimate Computer Costs. Computer costs associated with
coding, integrating, and testing the software are assumed to be proportional
to the costs of labor. They are estimated to add 104 percent (in 1972)
to the total technical labor cost estimated in step (5).

* This initial model does not incorporate any tradeoffs between phases
or program size. These refinements are introduced subsequently.


7.  Estimate  Management  and  Documentation  Costs.  The  costs  of 
project  management  and  software  documentation,  like  computer  costs,  are 
assumed  to  be  proportional  to  the  costs  of  labor.  They  are  estimated  to 
add  28  percent  to  the  total  technical  labor  cost  estimated  in  step  (5) . 

8.  Sum  Costs.  The  final  step  is  to  sum  the  cost  estimates  of 
steps  (5),  (6),  and  (7)  to  determine  the  total  software  cost  estimate. 

The main problem in this cost model is that small errors in the
first four steps are multiplied in the later steps.
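
The eight steps can be chained into a short calculation, which also makes the error-multiplication effect visible. The following is a rough sketch rather than the model of Ref. 22 itself: it uses only the factors stated in the text (70 percent design-change growth, a 21 percent coding share, 104 percent for computer costs, 28 percent for management and documentation, and the 20 percent high-order-language labor factor). The word-length and hardware-constraint adjustments are omitted because their exact form is not recoverable here, and the function name, coding rate, and labor rate are hypothetical inputs.

```python
DESIGN_CHANGE_FACTOR = 1.70   # step 2: 1,700 instructions developed per 1,000 delivered
HOL_LABOR_FACTOR = 0.20       # step 4: coding labor with a high-order language
CODING_SHARE = 0.21           # step 5: coding as a share of technical labor
COMPUTER_COST_FACTOR = 1.04   # step 6: computer costs relative to labor cost
MGMT_DOC_FACTOR = 0.28        # step 7: management and documentation


def initial_model_cost(object_instructions, coding_rate, cost_per_man_month,
                       high_order_language=False):
    """Chain steps 2 through 8; hardware and word-length adjustments omitted."""
    effective_size = object_instructions * DESIGN_CHANGE_FACTOR
    coding_man_months = effective_size / coding_rate          # step 3 (Table 7.1 rate)
    if high_order_language:
        coding_man_months *= HOL_LABOR_FACTOR
    technical_man_months = coding_man_months / CODING_SHARE
    labor = technical_man_months * cost_per_man_month
    computer = COMPUTER_COST_FACTOR * labor
    mgmt_doc = MGMT_DOC_FACTOR * labor
    return {
        "coding_man_months": coding_man_months,
        "technical_man_months": technical_man_months,
        "labor": labor,
        "computer": computer,
        "management_and_documentation": mgmt_doc,
        "total": labor + computer + mgmt_doc,               # step 8
    }


if __name__ == "__main__":
    # Hypothetical project: 50,000 object instructions, medium difficulty
    # (1,000 instructions per man-month), $4,000 per man-month.
    estimate = initial_model_cost(50_000, 1_000, 4_000)
    for name, value in estimate.items():
        print(f"{name}: {value:,.0f}")
```

Under these factors every man-month of coding labor carries about 2.32/0.21, or roughly eleven, man-months' worth of total cost, so a 10 percent error in the size or coding-rate estimate moves the total by the same 10 percent of a much larger number.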

7.1.2  Refined  Software  Costing  Model 

A refined software model was developed concurrently with the model
described above. In effect, the refined model replaced steps (3), (4),
and (5) in the initial model with the following means of estimating total
technical labor costs:

1. Estimate Software Analysis and Design Labor Cost. The following
formula was used to estimate this cost:

    ln Y = -1.17 + 1.03 ln X_1 + 1.73 X_2

where

    Y   = man-years of effort
    X_1 = number of object instructions, thousands
    X_2 = 1, if software development is hardware-constrained
          0, if not.

2. Estimate Coding Labor Cost. The following formula was used
to estimate this cost:

    ln Y = -2.3 + 1.13 ln X_1 + 2.12 X_2

where Y, X_1, and X_2 are defined as above.

3. Estimate Integration and Testing Labor Cost. The following
formula was used to estimate this cost:

    ln Y = -0.86 + 0.934 ln X_1 + 1.64 X_2

where Y, X_1, and X_2 are defined above.

The development of these formulas was based on data from seventeen
software development programs, twelve of which were proprietary. The
programs included two NASA projects and a variety of military space,
aircraft, and missile applications.
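
A short sketch of how the three refined-model formulas might be evaluated follows. It assumes each formula has the log-linear form ln Y = b_0 + b_1 ln X_1 + b_2 X_2, with Y in man-years, X_1 in thousands of object instructions, and X_2 equal to 1 for hardware-constrained development and 0 otherwise; the dictionary and function names are illustrative.

```python
import math

# (b_0, b_1, b_2) for ln Y = b_0 + b_1 * ln(X_1) + b_2 * X_2, as read from
# the three formulas in the text. Names are hypothetical.
PHASE_COEFFICIENTS = {
    "analysis_and_design": (-1.17, 1.03, 1.73),
    "coding": (-2.3, 1.13, 2.12),
    "integration_and_testing": (-0.86, 0.934, 1.64),
}


def phase_man_years(kilo_instructions, hardware_constrained):
    """Return estimated man-years of effort for each phase."""
    x2 = 1.0 if hardware_constrained else 0.0
    return {
        phase: math.exp(b0 + b1 * math.log(kilo_instructions) + b2 * x2)
        for phase, (b0, b1, b2) in PHASE_COEFFICIENTS.items()
    }


if __name__ == "__main__":
    # Hypothetical 50,000-instruction system, not hardware-constrained.
    effort = phase_man_years(50.0, hardware_constrained=False)
    for phase, man_years in effort.items():
        print(f"{phase}: {man_years:.1f} man-years")
    print(f"total: {sum(effort.values()):.1f} man-years")
```

Because X_2 enters the exponent, a hardware constraint acts as a multiplier on each phase: about e^1.73 (5.6 times) for analysis and design, e^2.12 (8.3 times) for coding, and e^1.64 (5.2 times) for integration and testing.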

7.2  BASIC  MODEL  (AGGREGATED) 

As was previously stated, most approaches to software costing employ
a basic model which estimates total man-months, computer hours,
documentation, etc., from which total project costs may be derived. Two types
of models are common: those that are based on the size of the software to
be developed, and those that are not.

7.2.1  Estimates  Based  on  Size 

Most of these software costing models use an estimate of the size
of the software to be developed as a basis for all subsequent calculations.
Since it has been demonstrated in the literature that cost is not a linear
function of size (Refs. 9, 20, 27), especially for larger programs, estimating
relationships for larger programs should use nonlinear relationships.

The steps required to develop a cost estimate based on the expected
size of the software are as follows:
1. Determine the Type of Software. The first step is to
categorize the software by type. Wolverton (Ref. 19) selected the following
categories based on extensive experience:

• Control routines

• Input/output routines

• Pre- or post-algorithm processors

• Algorithms

• Data management routines

• Time-critical processors

Both Wolverton (Ref. 19) and Aron (Ref. 28) also recommend that software
complexity be taken into account by estimating the percentage of software
that fits into each of the following categories:

• Easy: very few interactions with other system elements
  (e.g., applications programs)

• Medium: some interactions (e.g., compilers, I/O packages,
  utilities, etc.)

• Hard: many interactions (e.g., operating systems)

2. Estimate the Size of the Software. Size is normally defined
as (1) the number of operational instructions (i.e., instructions
that will perform the basic functions for which the system was designed),
(2) the number of delivered instructions (i.e., operational instructions
plus supporting instructions such as hardware diagnostics and debugging
aids), or (3) the number of developed instructions (i.e., all
instructions written during the course of the project, whether delivered or not).

Several techniques (which are not addressed here) are currently
available for estimating software size (e.g., quantitative, constraint,
analog, etc.) (Ref. 44). Often, different techniques are used for different
portions of the software. In any case, however, the estimated size of the
software should be partitioned according to whether it is (1) new
software, (2) existing software to be modified, or (3) existing
software that may be used without modification.

Aron (Ref. 28) presents a method for sizing software by partitioning it
into modules whose average size can be estimated from previous experience.
It is then simply a matter of multiplying the number of modules by the
average module size to arrive at a total. An average module size of
400-1000 assembly-language instructions is common.
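
The module-count sizing method reduces to a single multiplication, and carrying the quoted 400 to 1000 instruction range through gives a size band rather than a point estimate. This is a minimal sketch; the function names and the module count are illustrative.

```python
def total_size(module_count, average_module_size):
    """Total size as the number of modules times the average module size."""
    return module_count * average_module_size


def size_band(module_count, low=400, high=1000):
    """Size band using the commonly quoted average module sizes."""
    return total_size(module_count, low), total_size(module_count, high)


if __name__ == "__main__":
    # Hypothetical system partitioned into 60 modules.
    low, high = size_band(60)
    print(f"{low:,} to {high:,} assembly-language instructions")
```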

3. Estimate Required Labor. From the size, an estimate is formed
of the amount of labor required to analyze, design, code, check out,
integrate, and test the type and quantity of software identified in
steps (1) and (2). There are two basic approaches to estimating required
labor: productivity measures and parametric estimating.

Productivity measures are estimates of the output of software
development personnel in terms of the number of lines of code* per unit
time that can be produced. Table 7.2 presents an example of a
productivity table developed by Aron (Ref. 28). Another source developed the
following estimates** based on the type of software:

                                          Productivity (instructions
Software Type                             per man-month)

Mathematical operations                   166
Report generation                         125
Logic operations                          83
Signal processing or data reduction       50
Real-time or executive functions          25

* The term "line of code" refers to a fully checked-out, tested, and
documented statement in the selected language.

** These estimates include all technical, support, and management labor
for the project.
TABLE 7.2

PRODUCTIVITY TABLE (Ref. 28)

                                  Duration
Difficulty                        6-12 Months     12-24 Months    More Than
                                                                  24 Months

Easy
  (Very Few Interactions)         20              500             10,000

Medium
  (Some Interactions)             10              250             5,000

Hard
  (Many Interactions)             5               125             1,500

Units                             Instructions    Instructions    Instructions
                                  per Man-Day     per Man-Month   per Man-Year

Nanus and Farr (Ref. 30) developed estimates of 255 instructions per man-month
for operational programs and 311 for utility programs. Another source (Ref. 29)
estimated between 166 and 250 instructions per man-month for utility
programs. Estimates presented in Sec. 6.3.1 were 33 instructions per
man-month for commercial programs and 100 for defense-system programs.
Figure 7.1 illustrates Boehm's view (Ref. 31) of how productivity is increasing
over time as a result of improved tools and techniques.
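
A productivity-based labor estimate divides the size of each portion of the software by the rate for its type and sums the results. The sketch below uses the five productivity figures by software type quoted above (instructions per man-month, covering all technical, support, and management labor); the function name and the size breakdown are hypothetical.

```python
# Instructions per man-month by software type, as quoted in the text.
PRODUCTIVITY = {
    "mathematical operations": 166,
    "report generation": 125,
    "logic operations": 83,
    "signal processing or data reduction": 50,
    "real-time or executive functions": 25,
}


def required_man_months(size_by_type):
    """Sum man-months over portions of the software, one rate per type."""
    return sum(size / PRODUCTIVITY[software_type]
               for software_type, size in size_by_type.items())


if __name__ == "__main__":
    # Hypothetical breakdown of a system by software type (instructions).
    breakdown = {
        "mathematical operations": 8_300,
        "real-time or executive functions": 2_500,
    }
    print(f"{required_man_months(breakdown):.0f} man-months")
```

In this example the 2,500 real-time instructions account for twice the labor of the 8,300 mathematical instructions, which is why the mix of software types matters as much as the total size.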


Parametric cost estimates relating to size are developed by plotting
the number of man-months for software development versus the number of
machine instructions, for a representative sample of past projects, and
then fitting a curve to these points. Figure 7.2 is an example of such
a curve (Ref. 32). In this case, the estimate of labor is derived through the
use of the size estimate developed in step (2). If the type and complexity
of software are to be considered when using this technique, then an
appropriate set of curves must be developed.
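
One way such a curve might be fitted is ordinary least squares on the logarithms of size and effort, which yields a power law of the form man-months = a x (instructions)^b. The power-law form is an assumption made here for illustration (the text says only that the relationship is nonlinear), and the past-project data below are invented, not taken from any reference.

```python
import math


def fit_power_law(sizes, man_months):
    """Least-squares fit of ln(man-months) = ln(a) + b * ln(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(m) for m in man_months]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def predict_man_months(a, b, size):
    """Evaluate the fitted curve at a new size estimate."""
    return a * size ** b


if __name__ == "__main__":
    # Invented sample of past projects: (machine instructions, man-months).
    past = [(5_000, 20), (12_000, 60), (30_000, 190), (80_000, 640)]
    a, b = fit_power_law([s for s, _ in past], [m for _, m in past])
    print(f"a = {a:.4g}, b = {b:.3f}")
    print(f"estimate for 50,000 instructions: "
          f"{predict_man_months(a, b, 50_000):.0f} man-months")
```

An exponent b greater than 1 corresponds to effort growing faster than size. A separate fit per software type or difficulty class would give the "set of curves" mentioned above.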
Figure 7.1. Technology Forecast: Software Productivity (Ref. 31)


4. Adjust Estimates. The labor estimates developed in step (3)
are adjusted to account for special conditions of the project: the
use of a higher-order language, hardware constraints, personnel
qualifications, etc. Some of the most common factors for these adjustments are
discussed in Sec. 7.3.

5. Estimate Cost Distribution. The adjusted labor costs are
then distributed over the system development life cycle to facilitate
the estimation of all other software costs (i.e., costs other than for
technical labor). Table 7.3 presents the actual percentage division
between major life-cycle phases for several major projects, as well as
estimates proposed by numerous experts in the field.

Table 7.4 presents various related rules of thumb that were
discussed by Morin (Ref. 9). The reader should note that this information was
taken directly from tables presented in Morin's paper without the
accompanying text. Since the accompanying text presents many evaluations,
restrictions, and caveats concerning these rules of thumb, the reader is
cautioned not to make use of them without first reviewing Morin's paper.

6. Estimate Other Software Costs. The steps up to now have
addressed the technical labor required to develop the software. The
sixth step is to estimate the cost of associated computer time,
documentation, and management.

Computer hours may be estimated as a function of the number of
man-months for software development or as a function of the number of
instructions. Figure 7.3 illustrates these two relationships (Ref. 30). In
general, estimates of four hours per man-month and 20 hours per 1,000
instructions are common in the literature.
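
The two rules of thumb give independent estimates of computer hours, and comparing them is a simple cross-check on the labor and size estimates. The following sketch assumes only the two quoted rates; the function names and the sample project are hypothetical.

```python
def computer_hours_from_labor(man_months, hours_per_man_month=4.0):
    """Computer hours as a function of development labor."""
    return man_months * hours_per_man_month


def computer_hours_from_size(instructions, hours_per_thousand=20.0):
    """Computer hours as a function of program size."""
    return instructions / 1000.0 * hours_per_thousand


if __name__ == "__main__":
    # Hypothetical project: 200 man-months, 50,000 instructions.
    by_labor = computer_hours_from_labor(200)
    by_size = computer_hours_from_size(50_000)
    print(f"by labor: {by_labor:.0f} h, by size: {by_size:.0f} h")
```

The two rules agree only when productivity is 200 instructions per man-month (4 hours per man-month divided by 20 hours per 1,000 instructions); a large gap between them suggests that the assumed productivity differs from that of the projects behind the rules.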


TABLE 7.3

DISTRIBUTION OF SOFTWARE EFFORT
(Refs. 22, 29, and 30)

Source                    Analysis       Coding and     Integration
                          and Design     Checkout       and Test

Projects:
  Apollo                  31%            36%            33%
  DAIS                    38%            15%            47%
  Gemini                  36%            17%            47%
  NTDS                    30%            20%            50%
  OS/360                  33%            17%            50%
  SAGE                    39%            14%            47%
  Saturn V                32%            24%            44%
  SETS/BL                 42%            18%            40%
  Skylab                  38%            17%            45%
  Titan III               33%            28%            39%
  X-15                    36%            17%            47%

Authors/Companies:
  Aron                    30%            20%            50%
  Boehm                   34%            18%            48%
  Brandon                 32%            28%            40%
  Brooks                  33%            17%            50%
  Farr                    35%            18%            47%
  GRC                     30%            20%            50%
  Krauss                  47%            16%            37%
  Raytheon (Business)     44%            28%            28%
  RCA                     32%            21%            47%
  TRW (C&C)               46%            20%            34%
  TRW (Scientific)        44%            26%            30%
  Wolverton               46%            20%            34%

Approximate Averages      37%            20%            43%
TABLE 7.4

SOFTWARE DEVELOPMENT RULES OF THUMB
(as Identified by Morin, Ref. 9)


System Design Phase:

Analyze Computer Program
Production Requirements

Analyze Similar and Interfacing
Systems

Analyze Requests for System Change

Design the Total System

Design  the  Computer  Program  System 

Familiarize  the  User  with  the 
System  Design 

Indoctrinate  Production  Personnel 

Coding  Phase: 

Develop  Program  System  Test  Plans 
Design  Programs 

Design  Program  Files 
Establish  System  Files 

Code  the  Programs 


3 to  9 weeks  for  a senior  programmer 


2 to  10  man-weeks  depending  upon  the 

nature  of  the  project 

Gross Estimate: 5 to 20% additional
costs, 10 to 15% additional for
schedules

1 to 3 man-months depending upon
conditions and the delays experienced

10% of the total man-months

3 man-days  per  design  document  per 
agency  contacted,  plus  allowances 
in  elapsed  time  for  travel 

With analysts turning over the design
to programmers, 1 month minimal
formal training time per programmer.
Without handover, training costs
minimal, but hidden.


1 man-month  per  10,000  estimated 
machine  Instructions 

1 man-month  per  1000  - 2000  machine 
instructions. 

1 man-month  per  1000  instructions  for 
large  programs  (over  30,000 
instructions) 

1 man-month  per  10,000  items 

1 man-month per 10,000 machine
instructions for small projects.

2 man-months per 10,000 machine
instructions for large projects (over 30,000
instructions)

Gross  Estimate:  1 man-month  per  5,000 
machine  instructions 


TABLE 7.4 (Continued)


Integration and Testing Phase:

Learn Test Environment and
Test Procedures


1 man-week per programmer


Test Individual Programs

Test  Program  Subsystem 

Test  Program  System 

Implementation  Phase: 

Outline  User  Documentation 

Conduct  Demonstration  Test 

Analysis  of  Test  Results 


Anticipate  1 error  per  30  Instructions 

About  20  percent  of  testing  effort 

Approximately  1 error  for  each  125 
instructions 

3 to 10 trials for smaller programs;
100 distinct trials may be required
for a larger program

Between 0 and 30 percent of total testing
effort depending on number of subsystems

About 50 percent of total testing effort
or 25 percent of the total effort


Approximately  two  man-weeks  per  user's 
document,  plus  writing  and  editing 
costs  of  producing  outlines  and  plans. 

About one week elapsed time for a
system of moderate size.

About  two  man-weeks  for  a system  of 
10,000  to  20,000  instructions. 

About  one  man-week  for  a system  of 
10,000  to  20,000  instructions. 


Documentation  of  Test  Results 


Approximately  two  weeks  for  drafting 
test  report. 


Aron (Ref. 28) estimates that a rate of 7-8 hours per man per month over
the last 50-70 percent of a project can be expected. He also estimates
that 2-3 hours per man per month can be expected during implementation
and 20 hours per man per month during system testing. Pietrasanta (Ref. 27)
generally agrees with this, as illustrated in Fig. 7.4.


Many estimates of documentation costs are currently available.
Morin (Ref. 9) summarizes many of these as follows:

• Approximately 10-35 pages of documentation per 100 lines
  of program code

• Two man-months per user's document for outlining

• Three to five pages (750 to 1250 words) per man-day
  drafting

• An average of 20 pages per man-day for technical review

• An average of 50 pages per man-day for editing

• An average of 15 to 20 pages per man-day for typing

• Two pages per man-day for illustrations (e.g., flow charts)

• An average of 10 pages per man-day to revise

In general, the cost per page of non-automated documentation is
approximately $35-$50. It is estimated that this accounts for 10 percent
of the total project cost.
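
The page-count and cost-per-page figures can be combined into a rough documentation cost band. This is only a sketch under the quoted ranges (10 to 35 pages per 100 lines of code and $35 to $50 per page); the function names and the program size are illustrative, and the caveats in Morin's paper apply.

```python
def documentation_pages(lines_of_code, pages_per_100_lines=(10, 35)):
    """Return (low, high) page counts for a program of the given size."""
    low_rate, high_rate = pages_per_100_lines
    hundreds = lines_of_code / 100.0
    return hundreds * low_rate, hundreds * high_rate


def documentation_cost(lines_of_code, cost_per_page=(35.0, 50.0)):
    """Return (low, high) cost in dollars for non-automated documentation."""
    low_pages, high_pages = documentation_pages(lines_of_code)
    low_cost, high_cost = cost_per_page
    return low_pages * low_cost, high_pages * high_cost


if __name__ == "__main__":
    # Hypothetical 2,000-line program.
    low, high = documentation_cost(2_000)
    print(f"${low:,.0f} to ${high:,.0f}")
```

The band is wide (a factor of five in this example) because the page-count range and the cost-per-page range compound.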

Costs  associated  with  the  management  of  the  project  are  estimated 

2g 

to account for 50 percent of the total project cost by Aron. This is nearly
the same as that previously estimated by GRC.

7.  Sum  Costs.  The  seventh  step  is  to  sum  the  cost  estimates 
developed  to  this  point  to  determine  the  total  software  cost. 

In conclusion, these seven steps are a generalized approach that
attempts to incorporate the basic features of each model reviewed in the
literature. In actuality, a particular model may include more steps and
take them in a different order. The intent here is to provide an
overview of the approach.

7.2.2  Estimates  Based  on  Parameters  Other  Than  Size 

In general, software cost models which do not employ software
size make use of steps similar to those just described. However, all
estimates previously based on the number of instructions are instead based
on such parameters as those shown in Table 7.5, which were developed
in 1970 and were designated Automatic Data Processing Resource Estimating
Procedures (ADPREP) (Ref. 12).* This study was preceded in 1967 by the study
reported in Ref. 6, which also did not use number of instructions as
a key input parameter.

* The reader is reminded that the published data from this study was used
extensively in Sec. 6.

TABLE 7.5

DEVELOPMENT AND INSTALLATION COST ESTIMATES (Ref. 12)

Total Number of Input Type Transactions of ADPS
input that normally are identified by a unique
transaction code and/or unique input format.

Total Number of Different Output Formats of
ADPS products.

Total Number of Record Types in Data Base where
a logical record type is a set of logically
related data fields independent of the physical
manner of storage.

Total Number of Input Data Fields that are
unique in content and/or format, e.g., if
there is a data field for "name" on six different
card formats, the number of unique data fields
is one.

Average Number of Transactions per Month of
Input, in thousands, originating outside the ADPS.

Average Number of Input Cycles per Month for
processing input data.

Average Number of Characters, in Millions, in
Data Base where the data base is a collection
of files that contain unique information, are
accessible to the ADPS, and are normally
referenced or updated. Intermediate files
are not included.

Net Growth per Month in the Size of the Data
Base, in millions of characters.

Average Number of Characters per Month of Output,
in millions, from the ADPS destined to users.
Intermediate output of the ADPS are not included.
Only nonblank characters are counted.

The Inclusion (or Exclusion) of On-Line Query
Processing in the ADPS Design, Coded as follows:
1 = inclusion; 0 = exclusion.

The Average Number of Records, in Millions,
Contained in the Data Base Files. Intermediate
files are not included.

Average Number of Output Cycles per Month for
production generated by the ADPS for the user.


The latest study which resulted in the development of estimating
relationships not strictly employing software size was conducted in 1974
by Frederic (Ref. 10). This study resulted in the software estimating
relationships presented in Table 7.6.

It is important to note that the salient feature of each of these
methods of estimating costs is simplicity. A few simple parameters are
estimated and then used in a single formula to calculate total software
cost.

The inherent problem with these types of models is that it is
difficult to impose adjustments for special factors. Also their accuracy is
very poor, perhaps as a consequence of their simplicity.

7.3  ADJUSTMENTS 

Past studies have identified several factors which strongly affect
the cost of software and should therefore be accounted for by
adjustments to a cost estimate. Several of the most significant factors are
addressed below.

7.3.1  Complexity 

Complexity is a function of the machine, the language, the type of
application, the size of the project, etc. Aron assessed complexity
as easy, medium, or hard as defined in Sec. 7.2.1, step (1). Furthermore,
he estimates that "medium" is approximately two times more difficult
than "easy," and that "hard" is four times more difficult.
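
As a rough illustration, Aron's difficulty ratios can be applied as
effort multipliers. The sketch below is a minimal Python example, not a
reproduction of Aron's procedure: treating "two times more difficult" as
twice the effort per instruction is an assumption, and the function name
and the `easy_rate` input (instructions per man-month for easy code) are
hypothetical.

```python
# Relative difficulty as quoted above: medium = 2 x easy, hard = 4 x easy.
# Interpreting difficulty as an effort multiplier is an assumption.
COMPLEXITY_FACTOR = {"easy": 1.0, "medium": 2.0, "hard": 4.0}


def adjusted_effort(instructions_by_class, easy_rate):
    """Return man-months, given instruction counts per complexity class.

    easy_rate is a hypothetical input: instructions produced per
    man-month for code of "easy" complexity.
    """
    total = 0.0
    for complexity, count in instructions_by_class.items():
        total += count * COMPLEXITY_FACTOR[complexity] / easy_rate
    return total


effort = adjusted_effort({"easy": 1000, "medium": 1000, "hard": 1000}, 500.0)
print(effort)
```

With equal amounts of code in each class, the hard code dominates the
total, which is why the classification in step (1) deserves care.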

7.3.2  Familiarity 

The extent to which assigned personnel are familiar with a
particular application can have a significant impact on the amount of labor
required. One large software house estimates that unfamiliarity can
decrease the normal coding rate by as much as 50 percent; or familiarity
can increase it by as much as 30 percent.

SUMMARY OF PROVISIONAL SOFTWARE ESTIMATING RELATIONSHIPS

(Ref. 10)

7.3.3  Hardware  Constraints 

Hardware constraints were previously identified in Sec. 6.2.3 and
measures were made of their impact. Figure 7.5 is another illustration
of the effect of hardware speed and memory size constraints on the
relative cost of software. Our results (Sec. 6.3.3) showed a five to one
growth in relative man-hour requirements, which compares reasonably well
with Fig. 7.5, considering that utilization was approximately 98 percent.

7.3.4  Interactive  Coding  and  Debugging 

The use of interactive terminals to code and subsequently debug
software can significantly reduce the effort required during the coding
and checkout phase. It is estimated that productivity can be increased
by as much as 20 percent through their use (Ref. 33).

7.3.5  Language 

The use of high-order languages such as FORTRAN and JOVIAL has
distinct advantages over conventional assembly-language coding. Hahn and
Stone (Ref. 34) developed the following estimates of software production
rates (instructions per man-day) for several high-order languages and for
machine language:

                      Low     Average     High

FORTRAN               3.3       4.5        5.7

COBOL                 4.5       5.8        7.2

JOVIAL                4.0       5.7        9.8

Machine Language      6.6       8.1        9.7

Figure 7.5. Hardware Constraints Cause Major Software Impact (Ref. 31)

Each statement in a high-order language takes more man-hours
to produce than a statement in machine language, but it results in the
generation of several object-language instructions. The ratio of the
numbers of object-code to source-code instructions is referred to as
an expansion ratio. In Sec. 6.4, we showed that JOVIAL (on a CDC
3800) has an expansion ratio of 2.4. Other language/machine combinations
are given in Table 6.2. Thus, use of a high-order language will reduce
the cost per object-code instruction, even though the cost per
source-code instruction is greater. For example, the object-language
production rates for JOVIAL in the table above would be 9.6, 13.7, and 23.5.
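
The conversion from source-statement rates to object-instruction rates is
a single multiplication by the expansion ratio. The Python sketch below
reproduces the JOVIAL figures quoted above; the function and constant
names are illustrative, and rounding to one decimal place is an assumption
made to match the precision of the quoted values.

```python
# JOVIAL source-statement production rates (instructions per man-day)
# from the Hahn and Stone estimates: (low, average, high).
JOVIAL_SOURCE_RATES = (4.0, 5.7, 9.8)
JOVIAL_EXPANSION_RATIO = 2.4   # JOVIAL on a CDC 3800, per Sec. 6.4


def object_rates(source_rates, expansion_ratio):
    """Convert source-statement rates to object-instruction rates."""
    return tuple(round(rate * expansion_ratio, 1) for rate in source_rates)


print(object_rates(JOVIAL_SOURCE_RATES, JOVIAL_EXPANSION_RATIO))
```

The result, (9.6, 13.7, 23.5), can be compared directly with the
machine-language rates of 6.6, 8.1, and 9.7, for which the expansion
ratio is one.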

This explains the findings of those who participated in the
Government/Industry Software Costing and Sizing Workshop. They
estimated that the use of a high-order language could halve the cost per
instruction. In addition, a major software house (Ref. 29) estimates that
the cost of software design, coding, and checkout can be reduced by as much
as 80 percent through the use of high-order languages. They also
estimate that the use of macros when coding in assembly language can result
in a saving of up to 10 percent.

7.3.6  Personnel  Qualifications 

The qualifications (as well as the quality) of the persons assigned
to a project affect the amount of labor required. Participants in the
Costing and Sizing Workshop estimated that there is approximately a
5:1 variability in labor requirements depending upon a person's skill.
However, this is a very difficult parameter to determine early in the
development cycle.

7.3.7  Personnel  Utilization 

The amount of time actually directed towards the completion of a
project is not 100 percent of time assigned (i.e., utilization is less
than 100 percent). Actual utilization is affected by such things as
turnover, computer down-time, coffee breaks, etc. We estimate that a
55 to 85 percent utilization rate is not unrealistic for most cases.
However, when a new computer system, operating system, etc. is involved
in a software development project, the utilization rate can drop below
50 percent.


7.3.8 Staff Size

Brooks (Ref. 20) contends that staff size is a highly critical consideration
because of the required interactions between personnel. He goes on to
quantify the amount of required communications as follows:

    Communications Time Factor = n(n-1)/2

where

    n = number of personnel

This formula indicates that, for example, four people require twice as
much communication as three.
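
The growth of this factor with staff size is easy to tabulate. The short
Python sketch below evaluates n(n-1)/2 for a few staff sizes; the function
name is illustrative.

```python
def communication_factor(n):
    """Number of pairwise communication paths among n people: n(n-1)/2."""
    if n < 0:
        raise ValueError("number of personnel must be non-negative")
    return n * (n - 1) // 2


for staff in (3, 4, 10, 20):
    print(staff, communication_factor(staff))
```

Three people give a factor of 3 and four people a factor of 6, which is
the doubling noted above; going from 10 to 20 people raises the factor
from 45 to 190.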

Another source (Ref. 29) estimates that a staff consisting of 6-10 people
can be very effective, but that a staff of more than 20 people can
require as much as three times the time and effort because of the
communication problem.

7.3.9 New Versus Existing Software

The software developed for a particular project normally consists
of (1) new software, (2) modified existing software, and (3) existing
software which may be used without modification. The costs of
developing new software have been discussed throughout this report. It is
obvious that the cost of using existing software that requires no
modification is small. The cost of modifying existing software, however, is
not well known.

The use of existing software, whether modified or not, significantly
impacts coding, integration, and testing, and particularly analysis and
design. One source (Ref. 29) estimates that it reduces the analysis and
design effort by as much as 80 percent, and the coding, integration, and
test effort by as much as 20 percent.

7.3.10 Requirement Changes During Development

Requirement changes are an important part of software cost variation
which to date had not been addressed quantitatively with any success.
In Section 6.3.4, we used the ADPREP data (Ref. 12) to develop a
relationship between man-hours, lines of code, and number of changes. The
relationship had an R² of .63 and is shown below.


    Development = (object instructions/1000)

                  + .110 (object instructions/1000 x no. changes)

It is recognized that the timing of a change is important in
determining its impact. However, data did not permit exploring that
dimension. The preceding equation estimates an average penalty (increased
man-months) for each change.

7.4  TIME-PHASING  THE  ESTIMATE 

An estimate of the total cost of the software by itself is usually
not sufficient for use by the Program Office. Costs must be spread
over the entire development life cycle to facilitate the preparation of
a time-phased budget. As a result, an estimate of the total time allowed
to do the job must be made.

Although  specific  methods  of  calculating  the  ideal  time  for  a 
software  effort  are  scarce,  the  literature  does  agree  on  the  general 
shape  of  the  curve  which  plots  software  cost  versus  elapsed  time.  Meyer 
developed  the  following  curve: 


(Figure not reproduced: software cost versus elapsed time.)

This curve shows that there is an optimum project time span that
minimizes costs. As the time decreases below this optimum time, costs
increase and eventually rise to infinity. As the time increases above
the optimum, costs increase due to the extended costs of a minimum level
of personnel, delays, ECPs, etc. The problem is that it is extremely
difficult to estimate this optimum time span.

Putnam (Ref. 37) notes that manpower costs have been inversely proportional
to the square of time for those projects which he has analyzed. However,
it appears that he has concentrated on projects which were time-constrained
from the start and has, therefore, not seen the increasing part of the
curve.

Meyers' (Ref. 36) estimating technique assumes a project duration in the
range of 17-30 months which is further divided as follows:

• System Design (5-9 months)

• Programming (7-12 months)

• Integration and Test (5-9 months)

Wolverton (Ref. 19) also addresses scheduling, but in a fashion that derives
the project time span after the project costs and man-months have been
spread over the development life cycle.

Once total development time has been defined, several rules can
be used to spread the cost. Putnam has studied the distribution of
resource consumption, defining the distribution of man-hours over time,
as well as the different phases. The popular 40/20/40 rule can also be
used to spread man-hours over the three major phases of development:
(1) Analysis and Design, (2) Coding and Checkout, and (3) Test and
Integration. Table 7.3 shows that the rule is within the ball park of
many recent projects, although the variance is fairly large.
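
Applying the 40/20/40 rule is a simple proportional split. The Python
sketch below shows one way to do it; the function name is illustrative,
and the `shares` parameter is an assumption added so that other phase
distributions can be substituted for the 40/20/40 default.

```python
PHASES = ("Analysis and Design", "Coding and Checkout", "Test and Integration")


def spread_man_hours(total_man_hours, shares=(0.40, 0.20, 0.40)):
    """Split a total man-hour estimate across the three major phases."""
    if len(shares) != len(PHASES):
        raise ValueError("one share is required per phase")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return {phase: total_man_hours * share
            for phase, share in zip(PHASES, shares)}


for phase, hours in spread_man_hours(10000).items():
    print(f"{phase}: {hours:.0f} man-hours")
```

For a 10,000 man-hour estimate this gives 4,000, 2,000, and 4,000
man-hours for the three phases respectively.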


Furthermore, in Sec. 5, we show evidence that the ratio of testing
to coding decreases with program size. Relationships using the NASA
data showed that as the program size increased to 25,000 lines of source
code, analysis and design remained around 41 percent while coding
increased from 17 to 21 percent and testing decreased from 42 to 38 percent.

If the relationships between the phases are developed, then they
can be used to spread man-hours over the development life cycle, using
the time-phasing of activities defined by Putnam (Ref. 37). In this case,
changes in distribution as program size increases are automatically
incorporated, and the use of the 40/20/40 rule is avoided.

7.5  SOFTWARE  OPERATIONS  AND  SUPPORT 

While a significant amount of research in the area of software
cost estimating has been devoted to the software development process,
much less attention has been given to the costs associated with software
operation and support. Cost data for defense projects indicates that
operations and support costs can significantly impact overall system
costs. Brooks (Ref. 20) estimates that the cost of maintaining software is
approximately 20-40 percent of the cost of the initial software
development. In addition, cost of making changes per object instruction (of
the altered code) can run from 10-100 times the original cost of
producing the line of code (Ref. 38).

As discussed at some length in Sec. 6.5, much confusion revolves
around what one calls "maintenance." Error correction, the only
portion of maintenance that can be attributed to the original code,
accounts for less than 1/3 of the technical effort of software
maintenance projects that were reviewed. The remaining time is devoted to
product improvement.

Some models have been developed to predict staffing requirements
for maintenance. For example, Planning Research Corporation (Ref. 12)
developed such a model in 1966 based upon ADP data. Various graphs
presented in their report provide estimates of (1) the required number
of maintenance personnel, (2) the required number of operations personnel,
and (3) the dollars per month for hardware costs of program maintenance.
The inputs required to determine these estimates are (1) characters
per month of input volume, (2) number of input fields, (3) characters
per month of output volume, (4) number of output fields, and (5)
characters in the data base.

In Sec. 6.5, we presented estimating relationships developed from
the ADPREP data base (Ref. 12) which suggested much lower costs for
maintaining programs written in high-order languages. However, the R²
values are low, so the relationships should be used cautiously.

The trouble with both these results is that they present staffing
models which include product improvement. It is our conjecture that
productive time not spent correcting errors will be used in defining
product improvements. Thus, requirements for original staffing tend to become
self-fulfilling. As a result, we have concentrated our efforts on
predicting requirements for error correction only. A model has been
developed in Sec. 6.5 which relates the cumulative number of problems
to the number of cards, number of design changes (ECPs), and the number
of previous errors corrected. In effect, we assert that number of
errors would diminish over time if there were no design changes. The
resulting equation had an R² of .64 and is given below:

    Cumulative No. of Problems =

        .396 + .368 (No. of source cards/1000)

        + .686 (No. of source cards/1000 x No. of Design Changes)

        + .482 (No. of source cards/100 x No. of Previous
          Problems Corrected)

Also, we examined the impact of V&V upon cumulative errors after
acceptance testing. Data presented in Fig. 6.13 supports the thesis
that V&V discovers 25-50 percent of the potential errors.


7.6  CONCLUDING  REMARKS 

In this section we have identified and briefly described some
of the more important techniques from the literature. We have also
incorporated the more useful results from this study. We hope this
information will be useful to Program Offices, although we again remind
the reader that there is little precision in the estimating techniques
developed to date. Therefore, each technique must be used with caution.


The  inability  of  the  cost  estimating  community  to  generate  more 
precise  techniques  has  been  largely  due  to  the  difficulty  in  obtaining  a 
large  set  of  consistently  defined  data.  As  a consequence,  trade-offs 
within  developments  have  not  been  examined.  If  a cost  reporting  system 
is  implemented  by  the  Air  Force,  such  trade-offs  will  be  more  visible. 
Elements  of  such  a reporting  system  are  described  in  the  next  section. 


It has been recommended that a consistent estimating technique
be used so that departures from the estimate can be recorded and the
estimating techniques improved. In particular, Devenny recommends
the use of the Price model, for which a software component will soon
be available. It is our opinion that a consistent estimating technique
alone is not what is necessary to develop better cost estimating
techniques. What is even more important is the consistent gathering of
actual data, and a sound, explicitly analytic approach to software
cost estimating that can be adapted to new procedures and technological
developments.


Basically, not enough is now known to standardize on one
technique. However, if standardization is required we recommend that it
should be a technique that is published and can be evaluated by the
software community. Since the Price model is proprietary and therefore
not published, we believe that it should not be selected as the
standard. It can, of course, be of value as one of several estimates used
to check for reasonableness.

REPORTING SYSTEM ELEMENTS

8.1 INTRODUCTION

The second major goal of the study was to develop a recommended
set of software reporting system elements around which a uniform data
collection system can be designed. Software resource-consumption data
can then be collected from various Air Force developments consistently
and unambiguously. This data can ultimately be used for developing
better software CERs.

GRC was asked to identify only the elements themselves. The
actual reporting system (forms, formats, information flows, etc.) was
beyond the scope of our study. Hence, this section only addresses the
definition of the elements and the frequency of data collection.
However, before discussing these elements, it is useful to restate the
goals of the reporting system and the resulting principles which have
guided the selection of elements.

The  four  primary  goals  of  the  reporting  system  are  as  follows: 

1. Provide for Cost Control by Program Offices. The main short-run
benefit of the reporting system will be the visibility it affords
the Program Offices for controlling software cost. Currently, a Program
Office receives cost information through the Cost Performance Report,
a deliverable of the Contract Data Requirements List. The information
reported is very aggregated. Some projects have only one line for
software cost data; others have more, provided that more than one
Computer Program Configuration Item (CPCI) has been defined and provided
that the level of cost reporting extends down to CPCIs. Also, reported
costs often do not include software engineering costs during analysis
and design, or formal testing costs. This inconsistency in definition
is a major contributor to apparent cost variability.


2. Provide a Data Base for Better Cost Estimates. The long-run
benefit of the reporting system will be the development of a data
base on software costs with consistent definitions. The usefulness of
such a data base for CER development is evident. It should be noted
that several previous studies (as early as 1966) have defined such a
software cost reporting system (Refs. 2, 3, 7). If some form of these
systems had been implemented, we would have a fairly good data base today.
Therefore, the exact form of the data base to be developed is probably not
as important as the implementation of a uniform data collection effort.

3. Have a Reasonable Chance of Wide Acceptance by Contractors.
Perhaps the earlier efforts were not implemented because they asked for
too much detail. For example, the referenced studies require that
redesign and recoding (in response to a discovered anomaly) be recorded
separately as additional design and coding, respectively. This is very
difficult to do and leads to an arbitrary allocation of time.

There is always a tradeoff between (1) the scope of the detailed
data collection effort required for future cost estimation, (2) the
aggregated information necessary for cost control, and (3) the even more
aggregated information that contractors will readily provide. The wider
the difference between (1) and (3), the more likely that contractors
will resist providing the information.

4. Be Based on Measurable Items. Detailed costs should be
gathered around typical contractor work packages and not artificial
groupings of costs. Furthermore, these work packages should be
standardized so that comparisons can be made between programs. However, the
time of managers and other indirect personnel, who are not assigned to
specific work packages, should not be artificially allocated to these
work packages. Data should be collected in aggregate and allocations
should be made later if desirable for some analysis.


Using  these  goals,  we  developed  the  following  three  principles 
which  have  guided  our  selection  of  elements: 

1. Resource consumption should be related to progress towards
   the completion of software. This is the single most important
   item for cost control. Currently, it is assumed that
   if 30 percent of the hours have been spent, then 30 percent of
   the work has been done. Cost overruns attest to the unsuitability
   of this idea.*

2. The level of detail should not be greater than what is
   specifically needed for cost control and the prediction of
   future resource consumption. This still may be too detailed
   for contractor acceptance, and tradeoffs have been made in
   element definition which sacrifice some information for this
   reason.

3. Data requested should relate to the actual development
   process and not require artificial allocations. This is
   self-evident and a central focus of the definition of a
   Work Breakdown Structure (WBS).

The reporting system elements were defined using these principles
as a guide. Conceptually, cost elements have three dimensions (Fig. 8.1):
the software end items being developed, the kind of resources being
consumed, and the phase of the life cycle in which the resource
consumption takes place. All three dimensions are important for cost control
and cost estimation.
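
One way to picture the three dimensions is as a record keyed by end item,
resource, and phase, which can then be totaled along any one dimension.
The Python sketch below is only an illustration of that idea: the class
and function names, the field names, and the sample values are all
hypothetical and are not the report's reporting-system elements.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CostElement:
    end_item: str     # software end item, e.g. a CPCI (hypothetical names)
    resource: str     # kind of resource consumed, e.g. a labor category
    phase: str        # life-cycle phase in which consumption occurred
    man_hours: float


def total_by(records, dimension):
    """Sum man-hours along one dimension: 'end_item', 'resource', or 'phase'."""
    totals = defaultdict(float)
    for record in records:
        totals[getattr(record, dimension)] += record.man_hours
    return dict(totals)


records = [
    CostElement("CPCI-1", "analyst", "Design", 120.0),
    CostElement("CPCI-1", "programmer", "Code and Checkout", 80.0),
    CostElement("CPCI-2", "programmer", "Code and Checkout", 20.0),
]
print(total_by(records, "phase"))
print(total_by(records, "end_item"))
```

Keeping the raw records at this level and aggregating afterwards is
consistent with the goal of collecting data in aggregate and making
allocations later only where an analysis needs them.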


* 

Reference  1 plots  cost  histories  for  several  BSD  software  developments 
against  original  estimates.  All  show  cost  growth,  some  by  more  than 


In  this  section,  each  of  these  dimensions  will  be  defined.  First, 
we  will  define  the  phases  of  the  life  cycle  (Sec.  8.2).  Because  of  its 
importance,  the  measurement  of  progress  towards  completion  of  the 
development  phase  will  then  be  discussed,  and  a set  of  data  elements 
describing  the  product  and  its  relation  to  the  phases  of  the  life  cycle 
will  be  specified  (Sec.  8.3). 


This will be followed by our definition of a standardized set of
software end items, the second dimension of Fig. 8.1 (Sec. 8.4). Next,
we will define the resources consumed and a set of resource-consumption
data elements (Sec. 8.5); the personnel and computer resources are
discussed more fully in Secs. 8.6 and 8.7 respectively. Section 8.8
will consider the feasibility of implementing this reporting system.
The section will be concluded with a discussion of two special
topics: in Sec. 8.9, how to handle changes in requirements (ECPs); and
in Sec. 8.10, contract work that precedes full-scale development.

Throughout, the reader should keep in mind that the development of
software for a weapon system is not easily separated from the development
of the rest of the weapon system. Occasionally, software is the end item
itself, or a separate subcontractor is responsible only for software.
In these cases, software cost definition and data collection is relatively
easy. However, in general, software will be only a portion of the
development contractor's work.

8.2  LIFE-CYCLE  PHASES 

The first dimension, that of the life-cycle phases, is the same as
the milestone definitions discussed in Section 2.1: analysis, design,
code and checkout, internal test and integration, qualification test,
installation, and maintenance (operations and support). A simplified
version of the software life cycle is presented in Fig. 8.2, showing
the relationships between phases, the relationship of phases and
milestones, and finally, the level of information available at each phase.
Fig. 8.2 is a simplified version of Fig. 2.3, specifically excluding
changes in requirements (ECPs) during software development.


The specific tasks that make up each phase are identified and assigned
to specific reporting-system elements in Sec. 8.6. These reporting
elements will sometimes bear the same title as a life-cycle phase.
When this happens, the activity definition rather than the milestone
definition is intended. A table is given (Table 8.14) which clarifies the
two different uses of these words.

Figure 8.2. Life-Cycle Phases and Level of Reporting


To review: the software development life cycle begins with the
Analysis phase, during which the software requirements are analyzed and
refined, functions are defined and allocated to specific software end
items, and Computer Program Configuration Items (CPCIs) are defined.
Each CPCI is specified in detail, including test requirements, and these
requirements are documented and baselined in the Part I (Development)
Specification document which is reviewed and approved at (or preferably
before) PDR.

Each CPCI is further disaggregated during the Design phase, with
functions being assigned to Computer Program Components (CPC), modules,
and subroutines, and detailed flow diagrams being developed which describe
their relation. Part II (Product) Specifications document this work,
which is reviewed at CDR. Part II Specifications are not baselined at
this time; they are updated periodically through the development, and
finally baselined at the Physical Configuration Audit (PCA).

Coding and Checkout of source code then proceeds with individual
modules, routines, etc., being assembled, compiled, and unit-tested at
different times. As source decks are completed, they are integrated
and tested with other modules during the Internal Test and Integration
phase. Redesign and recoding are accomplished as required until the
entire CPCI is ready for the Qualification Testing phase, which
completes the development. Software is then reproduced, installed, tested,
and modified (if required) at operational sites during the Installation
phase. The software is maintained with error correction in response to
error reports (DRFs), and changing requirements (incorporated through
ECPs) which form a new software development.

From this description it is easy to see that different levels of
information are available throughout the life cycle. During analysis,
and again during installation and operations, information is generally
available only on the software as a whole, and is difficult to separate
from information on the system hardware. During the other phases, the
software can be divided into end items (CPCIs) as specified by the Part
I Specifications, and resource consumption can be tracked at this level.
Further detail is available with the definition of Computer Program
Components (CPCs) and modules in the Part II Specifications. Although
some costs can theoretically be tracked at the module level during coding
and debugging, they become less trackable during Internal Test and
Integration and Qualification Testing. In addition, some costs can be
tracked to the module level during maintenance.

Phases  are  defined  by  milestones:  Preliminary  Design  Review  (PDR) 
marks  the  end  of  Analysis;  Critical  Design  Review  (CDR)  ends  Design; 
the  start  of  Preliminary  Qualification  Test  (PQT)  completes  Internal 
Test  and  Integration;  the  completion  of  the  Physical  Configuration  Audit 
(PCA)  marks  the  end  of  the  development  process. 

It  would  greatly  simplify  reporting  if  the  phase  boundaries  marked 
the end of activities (e.g., if CDR ended the design activities). In
fact,  however,  activities  cross  these  boundaries.  While  it  may  be 
possible  to  capture  some  of  this  overlap  (e.g.,  continuation  of  initial 
design  until  it  is  first  baselined),  attempts  to  do  so  in  detail  may 
cause  reporting  difficulty  and  result  in  either  the  reporting  of 
estimated  times  or  rejection  of  the  whole  reporting  scheme  by  contractors. 
Separating  redesign  from  recoding  when  correcting  a bug  in  Internal  Test 
and  Integration  is  a good  example  of  this  problem. 

It probably will be possible to capture planned overlapping of
activities, especially in the earlier phases of the life cycle.
Accordingly, we recommend that:

1.  Specific analyses which have not been completed by PDR but
    are specifically approved at PDR for later completion should
    be assigned to the Analysis activity.

2.  Initial design activities which have not been completed by
    CDR but are specifically approved at CDR for later completion
    should be assigned to the Design activity.*

3.  Coding that is initiated before CDR should be assigned to the
    Coding and Checkout activity, provided that it was specifically
    authorized at PDR.
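
The three assignment rules above can be sketched as a small routine. This is only an illustrative sketch, not part of the recommended reporting system: the record layout, the field names (`kind`, `approved_at`), and the function name `reporting_activity` are assumptions made for the example, and work that falls outside the three rules is simply returned as unassigned.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class WorkItem:
    """Hypothetical record of one piece of work (layout assumed for illustration)."""
    kind: str                          # "analysis", "design", or "coding"
    start: date
    end: date
    approved_at: Optional[str] = None  # review that approved the overlap: "PDR", "CDR", or None


def reporting_activity(item: WorkItem, pdr: date, cdr: date) -> Optional[str]:
    """Return the activity to report against, or None if no rule applies."""
    # Rule 1: analysis finished by PDR, or approved at PDR for later completion.
    if item.kind == "analysis" and (item.end <= pdr or item.approved_at == "PDR"):
        return "Analysis"
    # Rule 2: initial design finished by CDR, or approved at CDR for later completion.
    if item.kind == "design" and (item.end <= cdr or item.approved_at == "CDR"):
        return "Design"
    # Rule 3: coding started at or after CDR, or started earlier with PDR authorization.
    if item.kind == "coding" and (item.start >= cdr or item.approved_at == "PDR"):
        return "Coding and Checkout"
    # Unapproved overlap is not covered by the three rules.
    return None
```

Under these assumptions, an analysis task that finishes after PDR but was approved at PDR for later completion is still reported against the Analysis activity, while coding started before CDR without PDR authorization is left unassigned.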

Capturing this information along with the PDR and CDR dates will
show the cost consequences of planned overlaps. It is recognized that,
by definition, PDR should end the analysis activities, CDR should end
design activities, and coding activities should not begin before CDR.
However, the real world does not work this way. It is better to have
approved overlap and know what is happening than to have unapproved
overlap because of regulations. It is probable that some overlap is
cost-effective.

Separating redesign and recoding activities in later phases of the
life cycle is another story. It is very difficult to make this separation,
and we recommend that no attempt be made. This is a primary
difference from previously recommended reporting systems (Refs. 2, 3).
We speculate that one reason that these systems were not implemented could
have been the requirement to separately identify redesign and recoding in
these later phases.


When capturing overlap, reporting elements will be used. These reporting
elements may use the same terms as the life-cycle phases, but
relate to activities rather than milestones. The terms have been used
interchangeably in the literature. In Sec. 8.6, the distinction between
the two uses is clarified. New terms should perhaps be invented for
one set of these definitions.


We also recommend the establishment of a new milestone. This
milestone will occur whenever a module has been coded and debugged, and
its source deck is baselined for the first time. In effect, this
becomes a "moving milestone" and will provide an almost continuous
measure of product completion. The establishment of the source-deck
baseline is enhanced if a Program Support Library* is established for
the project. Costs and man-hours expended on a module will be reported
against the Coding and Checkout phase until the module is baselined, and
thereafter will be reported against Internal Test and Integration.
Contractors have reacted favorably to this idea.
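
The charging rule implied by the moving milestone can be sketched as follows. The function names `charge_phase` and `hours_by_phase` and the time-card layout are hypothetical, and the treatment of work done on the baseline date itself (charged here to Coding and Checkout) is an assumption the text does not settle.

```python
from datetime import date
from typing import Dict, Iterable, Optional, Tuple


def charge_phase(work_date: date, baselined_on: Optional[date]) -> str:
    """Phase against which a module's man-hours are reported.

    Assumption: work done on the baseline date itself still counts
    as Coding and Checkout.
    """
    if baselined_on is None or work_date <= baselined_on:
        return "Coding and Checkout"
    return "Internal Test and Integration"


def hours_by_phase(entries: Iterable[Tuple[date, float]],
                   baselined_on: Optional[date]) -> Dict[str, float]:
    """Sum (work_date, man_hours) entries for one module by reporting phase."""
    totals: Dict[str, float] = {}
    for work_date, hours in entries:
        phase = charge_phase(work_date, baselined_on)
        totals[phase] = totals.get(phase, 0.0) + hours
    return totals
```

Because the split depends only on the module's own baseline date, no artificial allocation between phases is needed; a module that has not yet been baselined charges everything to Coding and Checkout.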

We  feel  that  reporting  costs  and  man-hours  with  respect  to  these 
life-cycle  phases  will  not  require  artificial  allocations. 

8.3  SOFTWARE  PRODUCT  INFORMATION  AND  PRODUCT  COMPLETION  STATUS 

A clear, concise, and accurate description of the software product
to be developed is required for any project on a historical data base.
Too often, pertinent information about the software product is lost or
ill defined. Attempts at relating resource consumption to ill-defined
input variables are doomed to failure. Lack of precision with respect
to these input variables is often the result of depending upon human
memory of what occurred during the development, years afterward.
Collecting quantitative, descriptive data on the software development
can alleviate this uncertainty.

An equally important problem is to relate resource consumption
to software completion status. We believe this is the single most
important resource reporting requirement for cost control. Currently,
the Program Offices must rely on reports of the percentage of man-hours
expended to ascertain the progress of development. However, the
examples in which percentage of man-hours expended meant far less than
the equivalent percentage of software developed are too numerous to
place much faith in this method. Yet the Program Offices currently have
no other measure of product status between CDR and the beginning of
Qualification Test.

*Ref. RADC TR 74-300, Structured Programming Series.

We recommend that software description and status information be
collected during the software life cycle as shown in Table 8.1. Specific
information to be requested is shown in Tables 8.2 and 8.3.

During  the  early  part  of  the  software  development,  before  CDR, 
product  descriptions  will  be  estimates.  The  level  of  product  definition 
will  continue  to  be  refined,  but  no  actual  code  will  have  been  completed. 

At the beginning of the contract, we suggest recording the original
schedule, purpose, and target hardware in a permanent data base. These
data will be reported for the system (segment) as a whole, since CPCI
definitions should not have been baselined this early in the
system-acquisition cycle.

At  PDR,  CPCIs  will  have  been  selected  and  baselined  along  with  the 
Part  I Specifications.  We  recommend  that  these  CPCIs  be  picked  from  (or 
consistent  with)  a standardized  list  of  software  end  items  called 
Computer  Program  Components  (CPCs).  It  is  important  to  gather  resource 
and  product  information  from  a standardized  list  of  end  items  so  that 
CERs  can  eventually  be  based  on  the  CPCs,  much  as  a hardware  cost 
estimate  is  now  made  for  the  component  parts. 


*A suggested list is presented in Sec. 8.4.

TABLE 8.1

SOFTWARE PRODUCT DESCRIPTION AND STATUS

[Table body not recoverable from the source. Surviving note: Table 8.2, paragraphs so lettered; items marked (I) only.]


TABLE 8.2

SOFTWARE PRODUCT DESCRIPTION

(I) marks data to be included in interim reports
(reports 1-Z of Table 8.1)

A.  Descriptive Information for the Entire Software

    (I) Purpose
        Definition of CPCIs
        Definition of CPCs
    (I) Description of Target Hardware

B.  Milestone Information for the Entire Software Development (I)

    For each of the following, the date estimated at contract award
    and the actual date:

        PDR
        CDR
        Part I Specifications baselined
        Part II Specifications baselined
        First coding
        Initial design completed
        Initiation of qualification testing
        PCA

C.  Descriptive Information for Each CPC

    (I) Name
    (I) Brief statement of purpose
    (I) Parent CPCI (if CPC is not a CPCI)
        Names of programs required for testing
        Any automated V&V tools utilized during development
        Target hardware (if different from that of other CPCs)

D.  Development Method for Each CPC

    (I) Programming language or languages
        Host assembler and/or compiler if other than target machine
        Host computer if other than target machine
        Structured programming technique (if any)
        Separate V&V contract (if any)
        Size of object-code space constraint (if any)
        Execution-time constraint (if any)
        Delivery-date constraint (if any)
        Budget constraint (if any)
        Percentage of the implementation based on requirements
        adapted from existing system (if any); identification
        of system

E.  Milestone Information for Each CPC (I)

    For each of the following, the date estimated at PDR (CDR for † items)
    and the actual date (where not the same as in B):

        PDR
        CDR
        First module baselined †
        Last module baselined †
        Last assembly and/or compilation prior to Qualification Testing †
        Last assembly and/or compilation prior to PCA †
        Part I Specifications baselined
        Part II Specifications baselined
        First coding
        Initial design completed


F.  Code Size Measure for Each CPC (I)

    For each of the following, the PDR estimate, the CDR estimate,
    the actual at first baselining, and the actual at PCA:

    Source Code - Total lines (by programming language)
        New source code
        Adapted from existing code
        Existing code (unadapted)

    Object Code - Total lines
        Compiled from source code
        New object code
        Adapted object code
        Existing object code (unadapted)


G.  Code Changes Since Baseline for Each CPC

    (I) Number of compilations prior to first qualification testing
    (I) Number of assemblies prior to first qualification testing
        Number of compilations prior to PCA
        Number of assemblies prior to PCA

    Note: This information could be expensive to track. Require only if
    an automated data retrieval system (library)* is in use.

H.  Code Structure for Each CPC (I)

    For each of the following, record the actual number of items at
    first baselining and at PCA:

        Modules
        Non-overlapping fields in the data base
        Input formats
        Output formats
        Unconditional branches
        Conditional branches
        Interfaces other than data I/O

Note: The ultimate purpose of these measures is to define the program
complexity, and then relate it to cost. The above represents our best
judgement of the minimal list required for this purpose. Much has been
written  about  code  complexity,  and  a more  exhaustive  list  has  been 
compiled  by  IBM.^  The  composition  of  this  portion  of  the  required  data 
should  be  the  focus  of  continued  study. 

I.  Other Size Measures for Each CPC

        Number of pages in Part I Specifications
        Number of pages in Part II Specifications initially approved
        Number of pages in Part II Specifications at PCA
        Number of test procedures in qualification test procedures
        initially approved
        Number of test procedures actually executed during PQT and FQT
        Execution time (as compared to constraint)

*Program support library: see Structured Programming Series, RADC TR 74-300.

TABLE 8.3

REPORTING ITEMS FOR CHANGES TO CODE

    Number of errors detected
    Number of errors corrected
    Number of compilations
    Number of assemblies
    Total lines of source code added
    Total lines of source code modified
    Total lines of source code deleted
    Change to object-code size due to source-code changes
    Total lines of object code added with octal correctors
    Total lines of object code dropped (without source-code change)
    Reason for changes: coding error, design change, new requirement,
    caused by another software fix

The relationship between CPCIs and CPCs is discussed in the next
section. Suffice it to say now that CPCIs are generally (but not always)
more aggregated than CPCs and that their relationship should be known
at PDR.


At PDR, the CPCs should be defined (at least their interfaces and
performance requirements). This definition includes the following from
Table 8.2:

Section C:  CPC name, purpose, and relationship to CPCI.

Section D:  Programming language, and constraints on size, speed,
            delivery date, and budget.

Section E:  Milestone estimates for CPC development (excluding †
            items).

Section F:  Code size measures.


By CDR, the product definition will have been expanded so that
modules are defined for each CPC. At this point, the PDR estimates
should be updated for each CPC, including milestone estimates for the (†)
items in Section E of Table 8.2. In addition, for each module we should
have a detailed estimate of its size (Section F information from Table
8.2) and an estimated date for baselining the source deck (i.e., when the
source deck is coded and checked out for the first time), together with
an estimate of man-hours to develop the baselined source code. This
will provide the information necessary for charting software development
progress through the Coding and Checkout phase.

Progress towards initial completion of the source code should be
continually assessed. This can be accomplished by using the "moving
milestone". Each module source deck should be baselined, and the
completion date should be reported along with the size of the module
(Sec. F, Table 8.2). It would also be useful to record the composition
of the code (Sec. H, Table 8.2).

The continued reporting of source-code completion will provide
an almost continuous measure of product completion (e.g., 30% of the
source code has now been written and debugged). Furthermore, actuals
can be checked against estimates to see if man-hour consumption is in
line with estimates, product size is within projections, and coding is
on schedule.
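
One way such a completion measure might be computed is sketched below. The record layout and the names `ModuleStatus` and `progress` are hypothetical, and measuring percent complete as the share of CDR-estimated source lines belonging to baselined modules is an assumption of this sketch rather than a definition taken from the report.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ModuleStatus:
    """Hypothetical per-module record: CDR estimates plus actuals at baselining."""
    name: str
    est_lines: int                        # CDR estimate of source lines
    est_hours: float                      # CDR estimate of man-hours to baseline
    actual_lines: Optional[int] = None    # set when the source deck is baselined
    actual_hours: Optional[float] = None  # man-hours expended up to baselining


def progress(modules: List[ModuleStatus]) -> Dict[str, Optional[float]]:
    """Percent complete plus actual/estimate ratios for the baselined modules."""
    done = [m for m in modules if m.actual_lines is not None]
    total_est_lines = sum(m.est_lines for m in modules)
    done_est_lines = sum(m.est_lines for m in done)
    done_est_hours = sum(m.est_hours for m in done)
    return {
        # Share of the estimated product that has been coded and debugged.
        "percent_complete": (100.0 * done_est_lines / total_est_lines
                             if total_est_lines else 0.0),
        # Above 1.0 means baselined modules came out larger than estimated.
        "size_ratio": (sum(m.actual_lines for m in done) / done_est_lines
                       if done_est_lines else None),
        # Above 1.0 means baselined modules took more man-hours than estimated.
        "hours_ratio": (sum(m.actual_hours or 0.0 for m in done) / done_est_hours
                        if done_est_hours else None),
    }
```

With three modules estimated at 300, 500, and 200 lines, baselining the first one puts the product at 30 percent complete by this measure, and the two ratios show at once whether that module overran its size and man-hour estimates.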

In practice, this reporting requirement may be deemed excessive
by the contractor. However, requiring delivery of the source code as
it is completed can hardly be objected to and will provide all required
information. Air Force project monitors can analyze each source deck,
extract the information, and verify its completion. Furthermore, making
this a requirement will provide great incentive to the contractor to stay
on schedule.

Although not appearing in Table 8.1, changes to the baselined
source code for each module ideally should be tracked continually,
beginning with baselining and continuing until the entire CPCI is ready
for FQT. Reported information would include for each module the number
of errors detected, number of times the code changes, total lines of code
added and deleted, and current module size. Such information would
allow the Program Office to spot problems quickly.

Tracking changes is technically feasible if the library concept
is being used, but would probably be considered excessive by contractors.
We therefore do not recommend the regular reporting of this information.
However, the capability and right to sample library records periodically
would accomplish the same control objectives. Properly done, it could
also provide enough information for cost prediction.

Prior to Qualification Testing, a complete description of each
CPCI should be prepared. This would include information regarding the
actual code developed (items marked (I) in Secs. C-H of Table 8.2) and
previously reported as estimates, together with the number of compilations
and assemblies performed during Internal Test and Integration.

During Qualification Testing, the Government can capture changes.
Accordingly, we recommend that the data of Table 8.3 be captured for
each CPC on a monthly basis. This will provide a great deal of visibility
during this critical period.

At PCA, a detailed set of measurements describing the product
should be completed and included in the project history. We recommend
that the product-description data elements identified in Table 8.2 be
collected. This need not be a contractor requirement since the analysis
could be performed by Air Force personnel on the delivered product.

If changes to the code are required during Installation, these
changes ought to be noted by updating the CPCI description data for
the affected site.

Finally, during Maintenance, changes to the code in response to
errors should be recorded monthly. We recommend that the types of data
identified in Table 8.3 be reported. New data can be prepared which
summarize the monthly changes by CPC whenever a new version (in
response to an ECP) is incorporated into the system.

Accumulating this product information should not only provide a
significant improvement in control, but will also provide the
product-description information necessary to develop good CERs.

8.4  END  ITEMS 

The second cost-element dimension (Fig. 8.1) is the end item. As
previously mentioned, it is important to establish a standardized set
of software end items which can be used to define the software development.
Then, if resource consumption and product information are
collected against this list for a number of developments, sufficient
information will be accumulated to develop CERs for the software end
items. Ultimately, software can be described by its parts, much like
hardware, and estimates of the total package may be based upon estimates
of the individual parts. Also, such an end-item list will give the
Program Offices greater visibility into the software development. One
budget line item for software is simply not enough visibility for cost
control.

The end items should be functionally oriented so that they can
realistically form contractor work packages. New requirements may
require additions to this end-item list, but at least the standardized
list will provide a basis for gathering reasonably comparable information
across development programs.*

The end items have been designated Computer Program Components
(CPCs), and it is anticipated that CPCIs will be made up of groups of
CPCs. Standardizing a CPCI list, which is conceptually similar, was
viewed as too restrictive. CPCIs should be based upon the level of
control identified in the Part I Specifications and specific test plans.
Occasionally, for high-risk items, a CPCI could be a part of a CPC.
For example, a particular JOVIAL compiler may be both high-risk and
critical to the development, and it may be desired to make it a separate
CPCI. In this case, if "all compilers" is a CPC, as we are suggesting,
then at least two CPCIs should be defined: one for the JOVIAL compiler
and one for the remaining compilers.
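
The JOVIAL example can be illustrated with a small roll-up from project-specific CPCIs to the standardized CPC they report against. This is a sketch only: the CPCI names, the function name `rollup_to_cpc`, and the man-hour figures in the usage note are invented for illustration, and the CPC label is taken from the Compiler entry (1.3.2) of the suggested list.

```python
from collections import defaultdict
from typing import Dict

# Hypothetical mapping for the case in which a CPCI is only part of a CPC:
# both compiler CPCIs report against the single standardized compiler CPC.
CPCI_TO_CPC: Dict[str, str] = {
    "JOVIAL compiler": "1.3.2 Compiler",
    "Remaining compilers": "1.3.2 Compiler",
}


def rollup_to_cpc(hours_by_cpci: Dict[str, float],
                  cpci_to_cpc: Dict[str, str]) -> Dict[str, float]:
    """Sum man-hours reported by CPCI into totals per standardized CPC."""
    totals: Dict[str, float] = defaultdict(float)
    for cpci, hours in hours_by_cpci.items():
        totals[cpci_to_cpc[cpci]] += hours
    return dict(totals)
```

Calling `rollup_to_cpc({"JOVIAL compiler": 400.0, "Remaining compilers": 250.0}, CPCI_TO_CPC)` yields a single total for the compiler CPC, so data from this project stay comparable with projects that did not split their compilers into separate CPCIs.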

Our  suggested  CPC  list  is  given  in  Table  8.4,  with  definitions 
of  terms. 

*If a particular end item is being developed on more than one computer
system, then it should be split into separate end items and tracked
separately.

TABLE 8.4

COMPUTER  PROGRAM  COMPONENT  LIST 


1 SUPPORT  SOFTWARE 

Supports  the  development,  testing,  operation,  and  maintenance  of  the 
applications  software. 

1.1 EXECUTIVE

Coordinates  the  operation  of  the  computer  hardware  for  a particular 
application,  often  referred  to  as  the  "operating  system." 

1.1.1  Computer  Resource  Manager 

Performs the overall control function for a computer system in a
supervisory mode. Includes such features as an I/O supervisor, a permanent
file manager, a CPU scheduler/memory manager, etc.

1.1.2  Computer/Peripheral  Interface 

Coordinates  and  controls  the  internal  transfer  of  data  between  the 
CPU  and  peripherals:  disks,  tape  drives,  card  readers,  printers,  etc. 

1.1.3 Computer/Operator Interface

Coordinates  and  controls  the  transfer  of  data  between  the  CPU  and 
data  display/entry  devices  which  constitute  the  man-machine  interface. 

1.1.4  Computer/Computer  Interface 

Coordinates  and  controls  the  internal  transfer  of  data  between  CPUs 
within  the  system. 

1.1.5  Computer/Special  Device  Interface 

Coordinates  and  controls  the  internal  transfer  of  data  between  the 
CPU  and  such  special  devices  as  sensors,  communications  multiplexers,  etc. 

1.1.6  System  External  Interface 

Coordinates  and  controls  the  transfer  of  data  between  the  system 
and  other  systems  external  to  the  system. 

1.1.7 System Failover and Recovery

Coordinates and controls all internal error handling, system failure
reconfiguration, and recovery procedures required to continue system
operation, even if in a degraded mode.

1.1.8  Performance  Monitoring 

Collects  appropriate  system  performance  data  (on-line)  for  subsequent 
reduction  and  analysis  (off-line). 

1.2  UTILITIES 

Perform auxiliary functions (off-line) such as reading cards, printing,
transferring files from device to device, performing housekeeping
functions required for file residence on mass storage devices, etc.


1.3 LANGUAGE PROCESSOR

Converts programs in one language to those in another.

1.3.1 Assembler

Operates upon a symbolic-language program to produce a machine-language
program. The symbolic instructions generally correspond one-to-one with machine
instructions. An assembler does not make use of the logical structure of the
program.

1.3.2  Compiler 

Operates  upon  a symbolic-language  program  to  produce  a machine-language 
program,  using  extensive  syntactic  analysis.  The  compiler  makes  use  of  the 
logical structure of the program and generates more than one machine
instruction for each symbolic instruction.

1.3.3  Interpreter 

Translates and executes each source-language instruction before translating
and executing the next one. In the case of looping, the translation
is repeated for each instruction in the loop each time control passes through
the loop.

1.3.4  Translator 

Translates statements in one symbolic language to equivalent statements
in another symbolic language. The translation may be many-to-one,
one-to-many, or one-to-one. Macro-processors and pre-compilers are included
in this group.

1.4  LOADER 

Controls the reading of programs and data for input to a computer, either
for storage or for immediate use.

1.4.1 Bootstrap Loader

Initiates the reading of essential support software under "cold-start"
conditions.

1.4.2  Linkage  Editor/Loader 

Produces  a binary  form  of  a program  in  which  all  symbolic  references 
between  different  segments  have  been  replaced  with  binary  addresses,  and 
loads the binary program into core memory for execution.

1.5  HARDWARE  DIAGNOSTICS 

Detects and isolates hardware malfunctions and demonstrates the proper
functioning of the main processor, peripherals, displays, and data entry
devices, as well as the interfaces between the system, subsystems, and external
systems.

1.6 SYSTEM SIMULATIONS, TESTS, AND EXERCISES

Identifies potential areas for software improvement, detects and isolates
software malfunctions, and demonstrates the proper functioning of the software.


1.6.1  Operational  Scenario 

Generates test situations by simulating "real-world" inputs to the system.


1.6.2  Control  and  Sequencing 


Coordinates the generation and presentation of simulated input data to
the  system. 


1.6.3  Data  Collection 


Collects  and  records  data  regarding  the  operation  of  the  system. 

1.6.4  Data  Reduction  and  Analysis 


Reduces the collected data and performs various types of mathematical
analyses upon which to base an assessment of the operational efficiency,
effectiveness, etc. of the system.


1.7  SOFTWARE  DEVELOPMENT  AIDS 


Supports the development of applications software, but is not part of
that software (e.g., debugging aids, data base analyzers, timing analyzers,
compiler writing systems, etc.).

1.8 PROJECT MANAGEMENT AIDS

Supports project management.

1.8.1 Schedule Maintenance

Prepares, updates, and projects event, milestone, and overall project
schedule to assist management in controlling the effort.


1.8.2  Financial  Accounting 


Maintains  detailed  accounting  records  on  manpower,  computer  resources, 
overhead,  etc. 


2 APPLICATIONS SOFTWARE


Performs  the  specific  tasks  and  functions  for  which  the  system  was 
developed. 


2.1  AVIONICS  SOFTWARE 


Controls, both automatically and in conjunction with the pilot, all
aspects of the operations of an aircraft performed by computer.

2.1.1 Mission Planning

Coordinates the operation of all avionics software to ensure that the
mission is accomplished in an effective manner. May provide any or all of
the following functions:

Preplanned Mission Evaluation
Real-Time Mission Modification
Steering Coordination
Weapon Delivery Coordination
Waypoint Sequencing and Mode Selection


2.1.2  Navigation 

Maintains  an  awareness  of  the  position,  course,  and  distance  traveled. 

Deals  with  any  or  all  of  the  following  navigational  techniques: 

TACAN 

LORAN 

Doppler  Radar 
Inertial  Reference  Unit 
Auxiliary  Attitude  Reference 
Air  Data  Computations 
Kalman  Filter 

2.1.3  Aircraft  Steering 

Coordinates  the  flight  control  software  so  that  the  aircraft  may  be  steered 
either  automatically  (autopilot)  or  by  the  pilot  (responding  to  displayed  data). 
May  provide  for  any  or  all  of  the  following  steering  modes: 

Course  Select 

Manual  Course 

Instrument  Landing  System 

Airborne  Instrument  Landing  and  Approach 

Data  Link 

TACAN 

2.1.4  Flight  Controls 

Responds to the aircraft steering inputs to control attitude, speed,
accelerations,  etc.  Manages  the  following  types  of  controls: 

Roll,  Pitch,  and  Yaw  Controls 
Velocity/Acceleration  Control 
Air  Induction  Control 
Energy  Management  and  Control 
Cockpit  Environment  Control 
Crew  Flight  Input 


2.1.5 Weapon Delivery

Controls the delivery of ballistic weapons, unpowered or powered terminal
guided weapons, and powered mid-course guided weapons. May provide any or all of
the following functions:

Continuous  Computation  of  Impact  Point 
Weapon  Miss  Distance  Computation 
Automatic  Release  Control 
Visual  Release  Control 
Stores  Management  and  Control 

2.1.6 Sighting, Designation, and Fixtaking

Computes ranges and angles to targets, destinations, and other points;
accepts tracking-handle inputs which generate position, velocity, and heading
error data for use by the navigation software; computes coordinates of terrain
features; and calibrates system altitude and height above target. May provide
for any or all of the following techniques:

Forward  Looking  Radar 
Astro  Tracker 
Laser  Spot  Seeker 
Low  Light  Level  Television 
Forward  Looking  Infrared 
Low  Altitude  Radar  Altimeter 
Tracking  Handle 
Electro-Optical  Sighting 
Visual  Sighting 
Fixtaking 

2.1.7  Display  Control 

Supports  the  presentation  of  data  to  the  pilot.  May  provide  for  any  or 
all  of  the  following  types  of  displays: 

Heads-Up  Display 
Navigation  Display 
Sensor  Display 
Data  Display 

2.1.8  Data  Entry/Retrieval 

Supports  the  entry  and  retrieval  of  data  by  the  pilot.  May  provide  any 
or  all  of  the  following  functions: 

Mission  Entry 
Aircrew  Panel  Control 
Data  Base  Access 

2.1.9  Communications 

Controls all voice, digital, and video communications to and from the
aircraft.


2.1.10  Electronic  Countermeasures

Controls the equipment that reduces the effectiveness of enemy equipment
and tactics employing electromagnetic radiation. May provide for any or all
of the following countermeasure techniques:

Threat  Warning
Electronic  Warfare 
Penetration  Aids 


2.2  COMMUNICATIONS,  COMMAND,  AND  CONTROL  SOFTWARE

Acquires relevant data, processes these data, presents the results to an
operator for timely decision-making, and generates an appropriate response
based on the decision.

2.2.1  Data  Acquisition

Controls the collection and initial processing of sensor data and the
transmission of these data to the main processor. Performs the following
functions:

Sensor  Control 
Signal  Processing 
Data  Transfer 

2.2.2  Data  Processing 

Receives, identifies, reduces, and combines input data for display.
Performs the following functions:

Mission  Control 
Data  Identification 
Data  Reduction 
Data  Manipulation 

2.2.3  Operator  Analysis  and  Decision  Making 

Forms the man-machine interface and provides for control of the mission
by  displaying,  monitoring,  and  accepting  data  at  operator  consoles.  Performs 
the  following  functions: 

Mission  Management 

Data  Display  and  Monitoring 

Operator  Data-Entry  and  Control 

2.2.4  Response  Generation

Controls the actions of the system in response to human decisions or as
a result of the automatic assessment of a situation. Performs the following
functions:


Communications  Switching 
Message  Processing 
Report  Generation 




2.2.5  Data  Storage  and  Retrieval 

Coordinates and controls the on-line storage, retrieval, and display of
data contained in the system's files, tables, and indexes.

3.  DATA  BASE 

Contains all files, indexes, tables, and libraries required to store
data to be used in the operation of the system, as well as the software required
to maintain this data base.

3.1  FILES 

Collections of related records, organized to meet a specific purpose,
stored on magnetic tape, disk, etc., or in some cases, directly in core memory.

3.1.1  On-Line  Updatable  Files 

Files  which  are  processed  on-line  by  console  operators  and  are  displayed 
and  updated  through  the  use  of  keyboard  data  entry. 

3.1.2  Internal  Files 

Files which are used internally by the application software. This type
of file is normally a read-only file which is transparent to the user.

3.1.3  System-Generated  Files 

Files which are generated by the system as a result of normal operations
(i.e., are not specifically updated by console operators). Historical data are
normally captured in this type of file.

3.1.4  Remote  Data  Base  Files 

Files which are utilized by the system and the operational personnel, but
are not physically located with the system and are not directly under its
control.

3.2  INDEXES 

Ordered lists of references to the contents of a larger body of data, such
as a file or record, together with keys or reference notations for identifying,
locating, searching, or retrieving the contents.

3.3  TABLES 

Organized collections of data, usually arranged in an array in which each
item is uniquely identifiable by some label or by its relative position.

3.4  PROGRAM  SUPPORT  LIBRARY

An  organized  collection  of  data  associated  with  a program  or  a group  of 
programs. 

3.4.1  Program  Library 

A collection of proven computer programs, routines, and subroutines which
can be combined with other programs or inserted into them by various methods to
solve problems or parts of problems.


3.4.2  Macro  Library 


A library that consists of sets of instructions which are represented
by a single macro-instruction. Using this library, a programmer can code
software such that the computer automatically generates the appropriate set
of instructions represented by each coded macro-instruction.


MAINTENANCE  ROUTINES 


Supports the off-line maintenance of the system data base (initialization
routines, data entry routines, restructuring routines, updating routines,
formatting routines, etc.).


Table 8.4 lists CPCs under three headings: Support Software,
Applications Software, and Data Base. The section on applications
software covers two areas specifically of interest in this contract:
avionics, and command, control, and communications (C³). A slightly
different approach was used for these two types of software. The C³
software is divided into functions which are common to all C³ systems.
To the extent that this list is exhaustive, there should be no need for
adding items. For avionics, however, applications are mission-oriented
and are more akin to a shopping list from which CPCs will be selected.
This list is likely to grow with technology.

An initial version of the CPC list was based on (1) previous
experience in software design; (2) review and analysis of the Work
Breakdown Structure (WBS) presented in the Request for Proposal for
this study contract; (3) review and analysis of various other existing
WBSs; (4) a survey of literature relating to the development of WBSs;
and (5) reference to literature describing the essential functions of
both avionics and communications, command, and control systems. Of
particular importance in the avionics area was work done by General
Dynamics (Ref. 39) and an article by Lynn Trainor (Ref. 40).

Once the initial version had been developed, entries were compared
with a list of CPCIs from recent ESD software developments, received
from the ESD project officer for this study. The review consisted of
determining whether each of the existing CPCIs could be categorized
according to the entries in the CPC list. An iterative process ensued
that refined the initial version to that presented in Table 8.4.

At the same time, Mr. John Glore of MITRE was preparing a similar
list as part of his effort in developing a Statement of Work preparation
guidebook (Ref. 41). Mr. Glore has concentrated more on the
non-applications software. His efforts were combined with ours in
preparing Table 8.4.

The CPC list is our "best effort" to identify the elements of a
software system. However, it is inevitable that additional entries
will be identified as the list is reviewed by people at other Program
Offices and as a result of the development of new projects.

8.5  RESOURCES

We now turn to the third dimension of software cost elements
(Fig. 8.1), namely, the resources. These resources fall into the
categories shown in Table 8.5. Some of these resources are
contractor-furnished while others are government-furnished; the first
three elements of contractor-furnished resources are the most important
to software and will be covered in more detail in the reporting system.

TABLE 8.5

RESOURCE CATEGORIES

Contractor
    Direct Technical Labor and Support Personnel
    Computer Resources
    Software Unique Facilities and Equipment
    Overhead
        Support Personnel
        Materials Consumption
        Travel
    G&A and Fee

Government
    Air Force Labor
        Military
        Civilian
    Technical Assistance Contractor Labor
    Government-Furnished Equipment and Facilities

Direct Technical Labor and Support Personnel is the most significant
category of the three. Man-hours, as well as costs, should be
collected. Important subcategories include Direct Software Development,
Configuration Control, Project Management, Documentation, and Training.
Details are discussed in Sec. 8.6.

Computer Resources, the next most important category, includes
acquisition, rental, and operation of the computer systems used in
software development and maintenance. Specifics are given in Sec. 8.7.

The third category, Software Unique Facilities and Equipment,
often is important in large software projects. These include a facility
for the software development and a facility (perhaps the same) for
centralized software maintenance. Specific costs include acquisition
(construction) or rental. Operation of the facilities should also be
charged to this category, although this tends to be an overhead
account and we therefore suggest handling it as a rental charge. This
would allow accountability for parts of facilities.

The remaining contractor items are very difficult to assign
directly to software. They include non-technical support personnel,
material consumption, travel, G&A, and fee. These items have less to
do with software itself than with the way the contractor does business.

A sufficiently elaborate accounting system could keep track of the
number of support personnel, specific travel, consumption of paper,
etc. However, we do not believe the collection of this type of
information would be cost-effective; requiring its separation and
collection is bound to lead to contractor resistance.

It is our recommendation that these costs be accumulated for the
contractor portion of the defense system as a whole, and a burden rate
be calculated. This burden rate could be compared among contractors,
if desired, to assure competitiveness. It could also be used to
allocate indirect costs if a total software cost for the contractor is
desired. Total contractor costs for a particular software development
would then be given by:

$$ C_T = (1 + R)\,(C_L + C_C + C_{S\&E}) $$

where

$C_T$ = total cost

$C_L$ = cost of direct technical labor and direct support personnel

$C_C$ = cost of computer resources

$C_{S\&E}$ = cost of software unique facilities and equipment

$R$ = burden rate covering overhead, G&A, and fee.
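
The burdened-cost calculation described above can be sketched in a few lines of Python. This is only an illustration: it assumes the burden rate is applied multiplicatively to the sum of the three direct cost categories, and the function name, parameter names, and dollar figures are hypothetical.

```python
def total_contractor_cost(labor, computer, facilities, burden_rate):
    """Gross up the three direct cost categories by a burden rate.

    Assumption: the burden rate (overhead, G&A, and fee) is expressed as a
    fraction of direct cost and applied multiplicatively.
    """
    direct = labor + computer + facilities
    return direct * (1.0 + burden_rate)


# Hypothetical figures, in dollars.
total = total_contractor_cost(
    labor=800_000,        # direct technical labor and direct support personnel
    computer=150_000,     # computer resources
    facilities=50_000,    # software unique facilities and equipment
    burden_rate=0.25,     # overhead, G&A, and fee as a fraction of direct cost
)
print(total)
```

Because the burden rate is computed for the contractor portion of the system as a whole, the same rate would be reused for every software development under that contract.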

Government resource consumption is also an important part of
software resources. The role of the government manager changes during the
life cycle from the Program Office staff, overseeing the software
development, to Air Force Logistics Command system manager, providing
or overseeing maintenance of the software. Costs include those of
military and civilian government employees, and Technical Assistance
Contractor(s). The reporting of these costs has been outside the
scope of our study, but we speculate that they are a large and
therefore important part of the total cost of software development and
maintenance.

Similarly, it is important to record government-furnished
equipment, facilities, and computer hardware as resources utilized for
software development and maintenance. This is particularly important when
comparing costs of software developments. Although gathering data on
this cost is beyond our scope, we feel that all government-furnished
software-related items must at least be specified in any software cost
reporting system. Thus, we recommend a resource category for
government-furnished equipment and facilities.


We now turn to definitions of the two main cost categories. The
elements will then be summarized in Sec. 8.8.

8.6  DIRECT  TECHNICAL  AND  SUPPORT  LABOR 

The primary resource consumed is direct technical and support
personnel man-hours. Basically, the cost of the item is equal to the
number of man-hours consumed times cost per man-hour. Of course, cost
per man-hour varies with skill categories, and the mix of skill categories
changes over the life-cycle phases. Thus, man-hour data alone are not
enough information to arrive at a good cost estimate. Therefore, we
recommend that both man-hour and cost data be captured in the reporting
system.
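
A small Python sketch shows why man-hours alone do not determine cost when the skill mix shifts between phases. The skill categories, hourly rates, and hours below are hypothetical values chosen only for illustration.

```python
# Hypothetical cost per man-hour by skill category (dollars).
RATES = {
    "systems analyst": 30.0,
    "applications programmer": 20.0,
}


def labor_cost(hours_by_skill, rates=RATES):
    """Cost = sum over skill categories of man-hours times cost per man-hour."""
    return sum(hours * rates[skill] for skill, hours in hours_by_skill.items())


# Two phases with identical total man-hours but a different skill mix.
design_phase = {"systems analyst": 800, "applications programmer": 200}
coding_phase = {"systems analyst": 200, "applications programmer": 800}

design_cost = labor_cost(design_phase)
coding_cost = labor_cost(coding_phase)

print(design_cost / sum(design_phase.values()))  # average cost per man-hour, design
print(coding_cost / sum(coding_phase.values()))  # average cost per man-hour, coding
```

Both phases consume 1,000 man-hours, yet the average cost per man-hour differs, which is the reason for capturing cost data alongside man-hour data.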

The ideal reporting system would report cost and man-hours by
skill category for every item in the contractor's work breakdown
structure. While this level of detail is technically feasible and
internally exists to almost this level for some contractors, there is
little or no chance that a contractor would allow the government such
visibility into its operations. The excuse would be that it costs too
much, and indeed the contractor would be at least partly right. This
level would certainly not be cost-effective for the government.
Furthermore, it is not really needed for cost control or for future cost
estimation.

In this section, the three dimensions of the reporting system
elements are consolidated into one set of reporting elements, at a level
sufficiently detailed to give adequate visibility into the development
process for cost control and direct-labor cost estimation. Both cost
and man-hour information are required. However, reporting on a regular
basis to skill category level, which has been suggested by some, is not
required. It is far too much detail for a regular reporting system, and
information on skill mix can be obtained from contractor estimates.

In this section, then, the cost reporting system elements are
first defined and then related to typical software activities in the
different life-cycle phases. Cost and man-hour data are to be reported
on a regular basis against this structure. Supplemental information,
based on contractor estimates, is then defined. These estimates are
made at key milestones and include information on skill-level
categories. Standard skill levels are also defined.

8.6.1  Suggested  Reporting  System  Elements 

Suggested reporting system elements for man-hours and costs are
shown in Table 8.6. They are broken down by level of end-item
reporting (CPC level, CPCI level, and total defense system or contract*
level). All data are for direct technical labor, with the exception
of support personnel such as managers, secretaries, and key punchers
who are totally occupied by contract software tasks. Support
specifically does not include personnel required to run the computer;
these are included in computer resources (Sec. 8.7).

* Reporting at different levels within the same extended work breakdown
structure was an expressed concern. However, in conversation with
Greg Maust, ESD Comptroller, it became clear that there is no
restriction on reporting at different WBS levels in the Cost Performance
Report.

Of utmost importance to future cost estimation is the collection
of resource requirements using a standard set of end items, which we
have called CPCs. These were defined in Sec. 8.4. However, this goal
should not require artificial allocations of time to these categories.
Accordingly, we recommend that only the following activities be
recorded to the CPC level, and identified with the activity label
specified: (1) detailed design of the CPC functions should be
recorded against Design, while coding and initial checkout of the CPCs
should be assigned to Coding; (2) redesign and recoding of a CPC in
response to test error detection should be recorded under Integration;**
and (3) modification of CPC code for specific site requirements should
be reported under Installation, while coding and testing of corrections
for bugs discovered during operation should be reported under
Maintenance.

** Note that no attempt should be made to separate redesign and recoding.

TABLE 8.6

REPORTING SYSTEM ELEMENTS

Activity              Definition

CPC Level: All man-hours for particular activities that can be
identified to the CPC level

  Design              Detailed CPC design

  Coding              Coding and initial checkout of CPCs

  Integration         Redesign and recoding in response to test error
                      detection

  Installation        Modification of code for site-specific application

  Maintenance         Coding and integration in response to error reports
                      (DRFs)

CPCI Level: All resources for particular activities related to the CPCI
and not assignable to CPCs

  Design              CPCI design, interfaces, and allocation of functions
                      to modules

  Testing             Defining and carrying out CPCI-level software tests

  Independent V&V     Independent verification and validation of CPCIs

Contract or System Level: All resources for particular activities
related to the contract or system and not assignable to CPCs or CPCIs

  Analysis            Studies to resolve conceptual problems and demonstrate
                      capability to meet requirements, including algorithm
                      development, allocation of functions to CPCIs, etc.

  Testing             Testing of software functions at system level,
                      including hardware interface

  Independent V&V*    Independent verification and validation at system
                      level

  Management

    Engineering       All management personnel assigned to oversee the
                      technical quality of the software product, including
                      directing software development, directing the
                      preparation and review of software portions of
                      technical documents, preparation for and attendance
                      at technical review meetings, on-site support during
                      operations, fault isolation, etc.

    Configuration     All management personnel assigned to oversee the
                      maintenance of baselined software and software
                      specifications and to document and incorporate
                      approved changes to the baseline

    Project*          Other management activities directly associated with
                      the software product, including contract management,
                      cost and schedule management, business and
                      administration planning, directing and controlling
                      the project

  Training            Preparation of course material, demonstration and
                      conduct of training courses if contracted, etc.

  Documentation       Writing, editing, publishing, reproduction, and
                      dissemination of software documents required by the
                      CDRL (could be reported at CPCI level if more detail
                      is needed)

    Technical         Part I and Part II Specifications, Test Plans, User's
                      Manuals, specified technical reports, etc.

    Configuration     Version descriptions, configuration index, change
                      status reports, specification change notices, etc.

    Program           Software development plan, cost performance reports,
    Management*       program schedules, program milestones, etc.

  Support*            Non-technical personnel totally assigned to the
                      software portion of the contract such as key
                      punchers, secretaries, etc.

* These items are sometimes difficult to separate from hardware.
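
The CPC-level assignment rules can be sketched as a simple lookup. The task labels used as dictionary keys are hypothetical timesheet descriptions invented for this illustration; only the five reporting element names come from the text. Redesign and recoding after a test error both map to Integration, since no attempt is to be made to separate them.

```python
from collections import defaultdict

# Hypothetical task labels mapped to the CPC-level reporting elements.
CPC_LEVEL_ELEMENTS = {
    "detailed design": "Design",
    "coding and initial checkout": "Coding",
    "redesign after test error": "Integration",
    "recoding after test error": "Integration",
    "site-specific modification": "Installation",
    "correction of operational bug": "Maintenance",
}


def reporting_element(task):
    """Return the CPC-level reporting element for a task label."""
    try:
        return CPC_LEVEL_ELEMENTS[task]
    except KeyError:
        raise ValueError(f"task not reportable at CPC level: {task!r}")


def tally_hours(entries):
    """Sum man-hours per reporting element from (task, hours) pairs."""
    totals = defaultdict(float)
    for task, hours in entries:
        totals[reporting_element(task)] += hours
    return dict(totals)


timesheet = [
    ("detailed design", 120),
    ("coding and initial checkout", 300),
    ("redesign after test error", 40),
    ("recoding after test error", 60),
]
totals = tally_hours(timesheet)
print(totals)
```

Tasks that cannot be identified with a single CPC raise an error here rather than being forced into a category, in keeping with the warning against artificial allocations.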

Direct labor at the CPCI reporting level includes those activities
that cannot easily be separated into CPCs but can be tracked naturally
to the CPCI level. This includes design activities concerned with the
allocation of functions to CPCs and modules, as well as interfaces
between CPCs. Defining and conducting CPCI-level tests in response to
Part I specification requirements should be reported against Testing
at this level. Also, any independent V&V activities which are related
to specific CPCIs should be identified at this level.

Direct labor reported at the system (or contract) level can be
divided into product and support activities. Product activities include
all the special studies performed under Analysis, including functional
allocation to CPCIs. System-level testing and independent V&V
activities relating to these tests should also be reported at this level.

The  support  tasks  Include  management,  training,  and  documentation. 
They  should  also  be  reported  at  the  system  level  with  management 
divided  into  engineering,  configuration,  and  project.  Documentation  is 
divided  into  technical,  configuration,  and  program  management. 

The above elements are for technical and management man-hours
only. Other support man-hours are reported in aggregate against the
Support element at the bottom of Table 8.6. Note that at the system
level it will be difficult to separate software from hardware or system
costs. As a result it may be advisable to report some of these costs
system-wide and not try to allocate them to software. For completeness,
we have included them in Table 8.6 because they would be present in a
software-only development. Therefore, for comparability they should at
least be estimated. More is said about this in Sec. 8.8, where the
proposed elements are correlated with an existing reporting system.

8.6.2  Relationship  of  Reporting  Elements  to  Life-Cycle  Phases 

These elements were selected after considering each of the software
activities under each of the life-cycle phases and trying to preserve as
much information as possible without requiring too much detail or manual
allocations. In this subsection, the cost elements are further defined
by typical activities occurring in the life-cycle phases (identified in
Sec. 8.2). Tables 8.7 through 8.13 map the tasks typically performed in
the phases into the proposed reporting elements. A summary is shown in
Table 8.14. Note that data are typically available through monthly
reports by phase. The tables show how the data are to be assigned to
the reporting elements.

The reader is again reminded that some of the reporting elements
bear the same name as the life-cycle phases. In effect, Table 8.14 is
a mapping of life-cycle phase names into activity definitions
(reporting elements) using the same terms. Also note that we in effect will
have both a milestone and an activity reporting system (columns vs rows).
By having both, and being able to track between the two (through
Table 8.14), the advantages of both methods will be attained. This was
done to show where difficulties lie between the activity and milestone
interpretation of these terms. Where activities cannot be separated
into the classic definitions, new terms are defined. Thus, Integration
includes redesign and recoding tasks which are inseparable in the
integration and testing life-cycle phases.

TABLE 8.7

REPORTING ELEMENTS: ANALYSIS PHASE (Milestone Definition)

Support         As in Table 8.6

TABLE 8.8

REPORTING ELEMENTS: DESIGN PHASE (Milestone Definition)
(PDR to CDR)

Project         As in Table 8.6
Training        Training Plan       Training plan specification.

REPORTING ELEMENTS: INTERNAL TEST AND INTEGRATION PHASE (Milestone Definition)
(Source Baseline to Start of PQT)

Could be reported to CPCI level if desired.

TABLE 8.12

REPORTING ELEMENTS: INSTALLATION PHASE (Milestone Definition)
(PCA to IOT&E)

Support         As in Table 8.6

REPORTING ELEMENTS: OPERATIONS AND SUPPORT PHASE (Milestone Definition)

Can be reported to CPCI level if more visibility desired.

TABLE 8.14

REPORTING SYSTEM ELEMENTS AND LIFE-CYCLE PHASES

Note the provision for early starts and late finishes in coding,
design, and analysis reporting elements. This is provided because it is
so often the practice. We do believe that this overlap should be limited
to specific CPCs with authorization at contract award, PDR, or CDR.
Authorization should be accompanied with supporting rationale. This should
aid in control, as well as ultimately facilitating an assessment of the
cost impacts of planned overlapping.

Note also that some activities are spread throughout the life-cycle
phases. These include testing, V&V, and training. Life-cycle (milestone)
definitions of the term "test" are obviously misleading.

The activities listed under each life-cycle phase were compiled
after reviewing about six WBSs for NASA and ESD software projects.
Operations and maintenance is an exception. It is based upon the SDC
reporting system to SAMSO as Computer Program Integration Contractor for
the Satellite Control Facility (SCF). For documentation requirements, we
referred to AFM 800-14. Some special comments are listed below.

The analysis performed by the development contractor is usually
not all the analysis performed for the software development. A
considerable amount of effort can already have been spent during the
concept formulation and validation phases of the system acquisition life
cycle (Fig. 2.1) to prove out concepts or build prototypes. Problems of
data comparability are thus introduced. There will be more about this
topic in Sec. 8.10.

V&V has been separately identified. Ordinarily, this will be
done by a separate contractor, which simplifies the reporting.
Qualification testing during maintenance and installation is shown as V&V,
since it is typically done on-site, or at least separately from the
development group. This is not the case for the Formal Qualification Test
of the original product, in which the developing contractor may run the
demonstration.


Note also that internal testing during installation could be shown
at the CPCI level, but this level of detail does not add much
information and we do not suggest it.

Finally, operations and maintenance has a peculiar problem. A
major activity, product improvement through ECPs, is not included because
we feel this should be shown as a new development and not a maintenance
activity. The reason for this split is that product improvement should
have no relation to errors in the original product.

Unfortunately, it will be somewhat arbitrary to separate management,
documentation, and support hours between these two major activities (ECP
development and error correction). In analyzing the costs of ECPs (or
error correction) some method for allocation of these support costs will
have to be used for comparability.

Also note that a great deal of the analysis will have been
completed before the ECP is accepted. Developments (ECPs) during the
maintenance phase, in essence, are lacking a recorded Analysis phase.
Analysis is performed as special studies. Also, it is common practice
not to repair low-priority errors during qualification testing of the
ECP development. These are merely reported as errors (DRFs) and
repaired when there is time. Hence, care will have to be taken in
comparing ECP developments among themselves or especially with new
software developments. More is said on this topic in Sec. 8.9.

8.6.3  Frequency  of  Reporting  and  Special  Reports 

As stated earlier, cost and man-hour data should be reported
against the proposed elements periodically (say monthly). From these
data, cost per man-hour can be calculated by life-cycle phase or by
resource element. Furthermore, a comparison with progress towards
product completion (Sec. 8.3) gives a good indication of whether a
project is on target. For example, what percentage of the coding dollars
and man-hours have been consumed and how does this compare to the
percentage of completed code? This comparison enhances the Program Office's
ability to control the development.
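
The comparison of resources consumed with progress toward completion can be sketched as follows. The function name, the use of lines of code as the measure of completed code, and all figures are assumptions made for illustration; the text does not prescribe a particular progress measure here.

```python
def progress_gap(spent, budget, completed_units, planned_units):
    """Compare percent of budget consumed with percent of product completed.

    Returns (percent_spent, percent_complete, gap), where a positive gap
    means resources are being consumed faster than the product is completed.
    """
    percent_spent = 100 * spent / budget
    percent_complete = 100 * completed_units / planned_units
    return percent_spent, percent_complete, percent_spent - percent_complete


# Hypothetical coding element: dollars spent versus lines of code completed.
pct_spent, pct_done, gap = progress_gap(
    spent=60_000, budget=100_000,
    completed_units=4_500, planned_units=10_000,
)
print(pct_spent, pct_done, gap)
```

The same calculation applies to man-hours by substituting hours consumed and hours budgeted for the dollar figures.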


Adding  a requirement  for  the  contractor  to  report  updated  estimates 
of  man-hours  and  costs  at  key  milestones  will  provide  greater  Insight. 

Furthermore,  this  Information  can  be  required  by  skill  category.  There- 
fore, we  recommend  that  estimates  of  man-hours  by  skill  category  be 

required  for  each  reporting  element  and  life-cycle  phase.  The  method 

a 

of  estimation  should  also  be  shown.  Total  direct  labor  costs  (not  by 
skill  category)  should  also  be  reported  so  that  an  actual  cost  per 
man-hour  can  be  derived  for  each  combination  of  reporting  element  and 
life-cycle  phase. 

This  estimate  should  be  made  for  all  development  phases  (through 
PCA)  at  contract  award.  Since  CPCs  and  CPCIs  will  In  general  not  have 
been  defined  at  contract  award,  estimates  will  be  aggregared  to  the 
system  level.  At  PDR,  these  estimates  can  be  updated  for  the  remaining  w' 

software  development  phases.  Visibility  to  the  CPC  level  will  then  be 
possible  and  should  be  required.  These  estimates  should  again  be  updated 
at  CDR,  to  the  CPC  level. 

Since dollar-per-man-hour factors may be calculated from these
estimates, the Program Office will be able to compare these to actuals
to see if the factors are changing significantly. If they are
significantly higher than planned, troubles with particular CPCs may be
occurring which require higher-priced talent to fix; hence the higher cost.
If the actuals are significantly lower than estimates, required progress
may not be taking place or the CPC may be coming in ahead of time.
Hence, the Program Office will have indications of anomalies, both good
and bad, long before they are officially reported.

*This recommendation has also been made by Captain Devenny (Ref. 1, pg. 88).
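The comparison of planned and actual factors might be automated along the following lines. The 10 percent tolerance is an arbitrary assumption (the report does not define "significantly"), and the CPC labels and rates are hypothetical.

```python
def flag_rate_anomalies(planned, actual, tolerance=0.10):
    """Flag CPCs whose actual dollars per man-hour differ from plan by more than tolerance."""
    flags = {}
    for cpc, planned_rate in planned.items():
        if cpc not in actual:
            continue  # no actuals reported yet for this CPC
        deviation = (actual[cpc] - planned_rate) / planned_rate
        if deviation > tolerance:
            flags[cpc] = "higher"   # possibly higher-priced talent fixing trouble
        elif deviation < -tolerance:
            flags[cpc] = "lower"    # possibly lagging progress, or ahead of time
    return flags


planned = {"CPC-1": 15.0, "CPC-2": 18.0, "CPC-3": 20.0}
actual = {"CPC-1": 18.0, "CPC-2": 18.5, "CPC-3": 16.0}
flags = flag_rate_anomalies(planned, actual)
```

A flag is only a prompt for investigation; as noted above, a low factor may mean either trouble or early completion.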


Requiring these detailed estimates will also build up a Government
data base of factors upon which to convert man-hours, by reporting
elements, to dollars. Hence, it will aid in future cost estimation.
Also, this information can be used to evaluate contractor proposals
(does he have the right mix of talent? are they reasonably priced? etc.).

8.6.4  Typical  Skill  Category  Definitions 

A standardized list of skill categories is required if the
contractor estimates are to have all the benefits mentioned above. We have
developed the following tentative list of skill categories that will be
required during various phases of the system life cycle:*

Management  personnel 
Scientists  and  engineers 
Systems  analysts 
Systems  programmers 
Applications  programmers 
Program librarian/secretaries
Data  entry  operators 

Excluded from this list are computer operators, who are included
under computer resources (Sec. 8.7), and support personnel who are not
assigned full-time to the software development.

*Skill category material was derived principally from Hansen's Weber
Salary Survey on Data Processing Positions, as well as various other
related salary surveys.

Each of these categories, except management personnel, may be
further classified according to level of skill:

Senior Level: Usually competent to work at the highest
technical level of all phases of a project. Supervises
lower-level personnel to provide technical guidance.

Level A: Able to work under general supervision on most
phases of a project but requires some technical guidance
on other phases.

Level B: Works under direct supervision on several phases
of a project but requires technical guidance and instruction
on most other phases.

Level C: Works under immediate supervision on individual
tasks, with the work being carefully checked.

Appendix C presents job descriptions for each of the skill
categories, and current salary information.

8.7  COMPUTER  RESOURCES 

There are three primary methods of using computer resources in
software development and maintenance. First, and most common for large
software developments, is a completely dedicated system, often utilizing
equipment which will become part of the operational system, or is a
duplicate of the operational system (a useful option if software is to
be centrally maintained). The Defense Support Program-CONUS Ground
Station (DSP-CGS) software development by IBM is an example of the
former, and the Satellite Control Facility (SCF) software development
by SDC is an example of the latter. (Both of these software
developments are part of SAMSO.)

A second method, utilized in smaller programs, is to develop and
maintain the software on a dedicated system which is not necessarily
the same as the target system. In this case, more general programming
techniques are used so that conversion to the target system is
simplified.

A third method, primarily used during debugging when I/O is limited,
is to use a base computer which is shared (e.g., time sharing) with other
projects.

In any case, costs can be divided into investment and operations.
Investment costs include acquisition, installation, and rental charges
for all contractor-furnished equipment. Investment costs can also
be included in the service charge for a computer center (non-dedicated
equipment) or a time-share service. Often, Government Furnished
Equipment (GFE) is used and will not appear as a cost to the contractor (or
to the Government in most reporting systems). Care must therefore be
taken when comparing computer costs to identify just what is GFE and
what is not.

Operating costs are charged to a separate account, often a separate
contract, for a dedicated system and are included in a user rate for
shared systems. For the latter, cost is directly a function of
utilization, while for the former, it is a function of system availability.
Of course, availability will be based on utilization. Therefore, it is
important (for comparisons) that systems be properly sized. Too little
capability may lead to smaller direct costs at the price of poor
turnaround time which, in the long run, may be more expensive.

It is important to note that utilization is far from static.
Many have reported increased utilization during development, as shown in
Fig. 8.3. This pattern also holds for ECP development during maintenance,
as Table 8.15 illustrates for the two SCF associate contractors.


TABLE 8.15

COMPUTER HOURS FOR MAINTENANCE
(Data on Satellite Control Facility Programs)

          Ratio of Computer-Hours to Technical Man-Hours

             Contractor A          Contractor B
Month       ECP  Maintenance      ECP  Maintenance    Event

1974
  Oct.      .09     .04           .11     .09
  Nov.      .07     .10           .06     .06
  Dec.      .10     .02           .09     .07
1975
  Jan.      .08     .02           .16     .10
  Feb.      .16     .14           .14     .06
  Mar.      .24     .21           .12     .04         Version 15.0C* FQT
  Apr.      .30     .08           .30     .03         Version 15.1** test and
  May       .25     .07           .33     .02           integration (Apr.-July)
  Jun.      .19     .11           .47     .03
  July      .26     .13           .76     .04         Version 15.1 FQT
  Aug.      .12     .09           .43     .03

*  Version 15.0C contained small changes which were predominantly DRFs.
** Version 15.1 introduced major new capability.

During the period depicted in Table 8.15, two revisions were
completed and installed. Version 15.0C went through FQT in March 1975.
This was a small development with a lot of normal maintenance DRFs.
Version 15.1, on the other hand, was a major new capability. It was in
integration and test from April 1975 to July 1975.

Note that for both contractors, computer utilization for ECP
development rose during Version 15.1 FQT. Although Contractor B
showed little change for Version 15.0C FQT, Contractor A showed a
significant change. Note also that utilization was at a fairly uniform
level during maintenance. The exception to this was March 1975, but
this too was connected with the maintenance actions incorporated with
the Version 15.0C FQT.

Although  no  cost  estimating  relationships  have  been  derived  from 
this  limited  sampling  of  the  SCF  data,  the  data  would  support  a model 
which  could  estimate  computer  utilization  from  technical  man-hours. 
Costs  could  then  be  derived  from  this  utilization  based  on  type  of 
operational  availability,  type  of  system,  and  amount  of  GFE. 
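As an illustration of the kind of model the data might support, the sketch below applies an average ratio of computer-hours to technical man-hours. The ratios are the Contractor B maintenance column of Table 8.15; the use of a simple mean and the 2,000 man-hour workload are assumptions, not a cost estimating relationship derived in this study.

```python
from statistics import mean

# Contractor B maintenance ratios from Table 8.15 (Oct. 1974 through Aug. 1975).
CONTRACTOR_B_MAINTENANCE = [0.09, 0.06, 0.07, 0.10, 0.06, 0.04,
                            0.03, 0.02, 0.03, 0.04, 0.03]


def estimate_computer_hours(technical_man_hours, ratios):
    """Estimate computer-hours as man-hours times the mean observed ratio."""
    return technical_man_hours * mean(ratios)


# Hypothetical workload of 2,000 technical man-hours.
estimated_hours = estimate_computer_hours(2_000, CONTRACTOR_B_MAINTENANCE)
```

A mean ratio is adequate only where utilization is fairly uniform, as in the maintenance columns; the ECP columns vary too much with test activity for a single average to be meaningful.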

Data collection that would be necessary to support any future
development of a cost model includes a system description, cost data
(including investment and operations), utilization data, and
availability data. Each is discussed below.


8.7.1  Computer  System  Description 

A complete description of the computer system should be recorded.
Table 8.16 is a list of items that should be considered for a batch
operation. Typical software items purchased with the system are also
shown in Table 8.17. The description should record which items are
included, how many, and their cost or rental charge. GFE should be
designated. If the equipment will be a part of the operational system
(a non-software cost), this should also be stated. Finally, a
description of the operating environment should be given: dedicated,
shared, or time-shared. For the latter two, the service charge per
computer hour should be recorded.

TABLE 8.16

COMPUTER HARDWARE

Hardware Item                 Typical Descriptive Parameters

MAINFRAME:

  CPU                         Machine cycle time, add time, no. CPUs,
                              no. registers, register size, no. buffers,
                              buffer length, instruction repertoire

  Main Storage                Storage type, read cycle time, write cycle time,
                              bytes fetched/cycle, minimum capacity, maximum
                              capacity, increment size, word size

  Additional Core Storage     Type, total bytes, data transfer rate, cycle
                              time, no. access channels

  Controllers                 No. devices/controller, data transfer rate

  Cabinets                    Dimensions, use

  Cabling                     Length, bit paths

  I/O Channels                Data transfer rate, no. channels, channel type

PERIPHERALS:

  Operators' Console          CRT or hard copy, functions available

  Printer                     Print positions, lines per minute, no.
                              character positions, lines per inch, skip
                              speed, type of printing (drum, train,
                              chain, other), paper width

  Card Reader, Punch          Cards/minute read and punched, positions/sec.
                              printed, card size, card hopper capacity, card
                              thickness, card codes, reading method

  Disk Pack:

    • Head/Track and          Fixed or removable, fixed or moving head,
      Moving Head             average access time, average data transfer rate,
                              bytes of storage, no. drives per pack, no.
                              drives per controller, average rotational
                              delay, no. channels

    • Cassette                No. tracks, inches/sec., bits/inch,
                              recording mode, data transfer rate

    • Floppy                  No. tracks, sectors/track, bytes/sector, no.
                              drives, hopper capacity, records/min. read,
                              RPM

  Drum                        Average access time, no. tracks, data transfer
                              rate, bit density, cycle time

  Tape Drive                  No. tracks, inches/sec., bits/sec., recording
                              mode, no. drives per controller, no. channels,
                              rewind speed, no. drives/cabinet

  Paper Tape Reader           No. tracks, characters/sec., reel size

  Paper Tape Punch            No. tracks, characters/sec., reel size

  Optical Scanner             Encoding type, document width, documents/
                              minute, no. pockets, fonts read, document
                              length, document thickness

  Plotter                     Table or cylindrical, size, plot records/
                              sec., pen type, dimensions

  Terminal:

    • CRT                     No. characters, keyboard type, printer
                              type, screen size, no. lines, no. characters/
                              line, expansion feature, no. keys

    • Teletype                Data transfer rate, character set capability,
                              forms feeder capability

    • Telephone               Size and weight, video capability, hard
                              copy capability, print speed, TTY compatibility,
                              telephone line discipline, data transfer rate

ANCILLARY EQUIPMENT:

  Key Punch:

    • To Cards                Card columns, character set, drum card
                              capability, print capability, duplication
                              capability, interpreting capability

    • To Tape                 Character set, format types, record length
      or Disk                 variability, programming functions

  Card Sorter                 Cards/min., character set, hoppers (no. and
                              size)

  Interpreter                 Cards/min., card size

  Reproducer                  Cards/min., functions

  Tape Cleaner                Reel size, speed, tape size

  Degausser                   Tape size, degaussing speed

TABLE 8.17

COMPUTER SOFTWARE

OPERATING SYSTEM:

  I/O Supervisor

  Permanent File Management

  CPU Scheduling/Memory Management

  Real-Time Support

PROGRAM DEVELOPMENT:

  Source Text Editor

  Assembler

  Compilers

  Linkage Editor/Loader

CHECKOUT AND TEST:

  V&V Tools:

    • Debugging Tools

    • Code Analyzers

    • Test Case Generators

    • Specification Language

  Configuration Management

  Test Bed:

    • Simulation System

    • Simulation Environment

DOCUMENTATION:

  Flow Charter

  Document Maintenance and Generation

8.7.2  Computer  Resource  Costs 

Although detailed information is required for direct labor,
computer cost data can be more aggregated. For a dedicated system, this
is the only logical way to keep the books. Even for shared systems,
costs can be allocated on the basis of total system utilization. Thus,
cost will depend on other users.
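One simple way to allocate a shared system's cost on the basis of utilization is in proportion to hours used. The sketch below assumes that proportional rule; the function name, user labels, and figures are hypothetical.

```python
def allocate_shared_cost(total_monthly_cost, hours_by_user):
    """Split a shared system's monthly cost in proportion to hours used."""
    total_hours = sum(hours_by_user.values())
    if total_hours == 0:
        raise ValueError("no utilization recorded for the month")
    return {user: total_monthly_cost * hours / total_hours
            for user, hours in hours_by_user.items()}


# The project uses 120 hours in both cases; only the other users' load changes.
busy_month = allocate_shared_cost(40_000.0, {"project": 120.0, "others": 280.0})
slack_month = allocate_shared_cost(40_000.0, {"project": 120.0, "others": 80.0})
```

The project's share rises from $12,000 to $24,000 although its own usage is unchanged, which is the dependence on other users noted above.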

For each system in question, we recommend that the acquisition
(investment) cost (or rental) be recorded and that total operating costs
for providing the service be reported monthly. Operating costs include
the cost of computer center operations personnel, user charges (excluding
investment), equipment maintenance, etc. Facilities maintenance should
be reported separately.

8.7.3  System  Utilization 

System utilization information should be divided into two parts:
(1) that for the subject project and (2) that for the total computer
system, including other clients, if shared. The latter can be reported
at a system-wide level, in terms of hours per week, just as investment
cost was reported at that level (Sec. 8.7.2). No special requirements
are necessary for time-share reporting.

For the project in question, computer-hour information at a more
disaggregated level should be collected to see if the patterns of
computer-hours per technical man-hour are different for each CPC. Thus,
we wish to track computer-hours by CPC for activities that take direct
labor hours at that level. This is also true for each CPCI and for the
system as a whole. There should be little difficulty in maintaining
computer hours in this manner. The direct labor account numbers will
suffice.

Reporting on a monthly basis will also allow one to relate computer
hours to activity within a CPC should more disaggregation of
computer-hour data be deemed necessary.

8.7.4  System  Availability 

System  availability  is  a little  more  difficult  to  measure.  Two 
types  of  measures  are  desired:  one  for  availability  and  one  to  see  if 
availability  is  suitable.  Availability  can  be  measured  if  a log  is 
kept  of  hours  per  week  during  which  the  system  is  available.  Whether 
this  meets  actual  needs  is  another  question.  For  batch  operations, 
turnaround  time  is  a good  measure  (where  turnaround  time  is  defined  as 
time  of  log  out  minus  time  of  log  in  minus  non-queued  execution  time). 

Both  average  and  95th  percentile  should  be  reported  monthly. 
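The batch turnaround measure and its two monthly statistics can be sketched as follows. The job log layout, the use of minutes, and the nearest-rank percentile method are assumptions; the report does not specify how the 95th percentile is to be computed.

```python
import math


def turnaround(log_in, log_out, execution):
    """Turnaround = time of log out - time of log in - non-queued execution time."""
    return log_out - log_in - execution


def percentile_nearest_rank(values, pct):
    """Smallest value with at least pct percent of the values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical job log for one month: (log in, log out, execution), in minutes.
jobs = [(0, 50, 10), (5, 35, 10), (10, 130, 20), (20, 60, 15), (30, 95, 5)]
waits = [turnaround(*job) for job in jobs]
average_wait = sum(waits) / len(waits)
p95_wait = percentile_nearest_rank(waits, 95)
```

Reporting the 95th percentile alongside the average exposes occasional long waits that the average alone would hide.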

For time-share systems, the same type of measure would be desirable.
However, this should include such things as waiting time in line, etc.
A better measure would be a ratio of total terminal time divided by
cumulative core execution time. I/O time and queue time would constitute
the difference. Again, average and 95th percentile would be required.

Note that all measures suggested are independent of program size
and nominal execution time.
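The time-share ratio can be computed per terminal session and summarized in the same way. The session records (in seconds) and the nearest-rank percentile are assumptions made for illustration.

```python
import math


def time_share_ratio(terminal_time, core_execution_time):
    """Total terminal time divided by cumulative core execution time."""
    return terminal_time / core_execution_time


# Hypothetical sessions: (terminal time, core execution time), in seconds.
sessions = [(600.0, 60.0), (900.0, 30.0), (300.0, 60.0), (1200.0, 100.0)]
ratios = sorted(time_share_ratio(t, c) for t, c in sessions)
average_ratio = sum(ratios) / len(ratios)
# Nearest-rank 95th percentile of the sorted ratios.
p95_ratio = ratios[max(1, math.ceil(0.95 * len(ratios))) - 1]
```

Because the measure is a ratio of two times, a long-running program and a short one can be compared directly.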

8.8  FEASIBILITY  OF  REPORTING  SYSTEM 

The  reporting  system  has  now  been  defined  and  it  is  useful  to  ask 
how  practical  the  definition  has  been.  First,  it  would  help  to  summarize 
the  data  required. 

Contractor's resource consumption data is summarized in Table
8.18 and computer resource description and utilization data is shown
in Table 8.19. Software product description and status data were
previously summarized in Table 8.1.

TABLE 8.18

CONTRACTOR RESOURCE CONSUMPTION DATA

In each category, estimated costs are to be reported once, at the
beginning of the contract, except as noted. Actual costs are to be reported
monthly, except as noted.

Item                                              Notes

Direct Labor (in dollars and in man-hours)        1, 2, 3

Computer Resources

  Computer Hours                                  1

  Investment                                      4

  Operations

Unique Facilities and Equipment

  Investment                                      4, 5

  Operations                                      6

Overhead Rate

G&A and Fee Rates

NOTES:

1. Disaggregated by product component as in Table 8.14.

2. Estimates disaggregated by skill category as in Sec. 8.6.4.

3. Estimates revised at PDR and at CDR.

4. Actuals reported when incurred.

5. Disaggregated by type of facility.

6. Disaggregated by type of facility or equipment.




The  first  thing  to  note  is  that  the  frequency  of  reporting  is 
different  for  the  different  data  items.  One-time  reports  or  those 
produced  at  specific  milestones,  though  detailed,  should  not  be  viewed 
as  a hardship  by  the  contractor  or  the  Government.  They  are  generally 
descriptive  or  represent  back-up  data  for  cost  estimates  which  should 
have  been  developed  for  a credible  bid. 


Some examples may prove helpful. For the software product
description and status, the following milestone-related reports are
required (Table 8.1):

Computer Program Development Plan at Contract Award

CPC Definition and CPCI Update at PDR

CPC Update at CDR

Module Definition at CDR

Start Qualification Test at PQT

Completed Product at PCA

Installation at IOT&E

The  first  three  are  estimates  and  represent  a detailed  description  of 
the  product  to  be  developed.  Details  are  increased  as  the  design  is 
completed  (CDR)  and  actuals  are  recorded  against  the  estimates  at  Start 
of  Qualification  Test  and  PCA.  The  report  submitted  at  PCA  is  a complete 
description  of  the  product  (Table  8.2),  including  descriptive  informa- 
tion, development  method,  milestones,  code  size,  code  changes,  code 
structure, etc. Modifications to the product are reported at site
installation. Data is detailed, but not beyond what can be collected.
Since  it  is  not  repetitive,  it  should  not  be  a burden. 

Similarly, estimates of resource consumption (Table 8.18) are
reported in depth at contract award and are refined at PDR and CDR.
The end-item level becomes more disaggregated as the product is further
defined and skill category data are required.

TABLE 8.19

COMPUTER RESOURCE DESCRIPTION AND UTILIZATION

Item                            Notes                          Reporting
                                                               Frequency

Computer System Description     Tables 8.16 and 8.17           One time

System Utilization              Hours per week (exclude        Monthly
                                time-share service)

System Availability             Hours per week                 Monthly

Average Turnaround              Time out, minus time in,       Monthly
(Batch-type)                    minus execution time

95th Percentile Turnaround      Time out, minus time in,       Monthly
(Batch-type)                    minus execution time

Average Turnaround              Terminal time ÷ cumulative     Monthly
(Time-share)                    core execution time

95th Percentile Turnaround      Terminal time ÷ cumulative     Monthly
(Time-share)                    core execution time

Note: Separate records should be maintained for each computer system
(e.g., development versus maintenance).

Actual investment costs of facilities, equipment, and computer
hardware and software packages are reported once, at the time they are
acquired. Similarly, a detailed description of the computer system is
required, one time.

These requests, though detailed, are reasonable because of their
infrequency.

For repetitive reporting items, we have attempted to keep the data
as aggregated as possible, without losing significant information content.
Still, due to the repetitive nature, the reporting requirement is an
imposition, and the ease with which the data can be automatically
processed must be addressed.

For product status information, requests during development are
minimal. When a source deck is first baselined, size and code structure
information is required (Table 8.2, F and H). If the contractor feels
this is a burden, then the Air Force need merely request a copy of the
source code and a listing, which is hardly unreasonable. Information
can then be extracted and a determination that the code at least compiles
successfully can easily be made.

During qualification testing, we recommend that detailed
information on code changes be tracked. Information can be produced by an
automated library system and access should not be a problem at this
stage of the development, since Air Force representatives are present
during this time.

During maintenance, the product description should be kept current
through monthly maintenance change reports (Table 8.3) and updates of the
product description (Table 8.2), when new models are incorporated. The
data seem a reasonable part of configuration control, and should be
viewed in that manner.


Similarly, computer resource utilization information will be
required monthly (Table 8.19). Reporting can easily be automated, since
tracking is system-wide. Such reports are currently being generated in
far more detail at SDC.

The only difficult items to track are cost, man-hour, and
computer-hour data required in Table 8.18 at levels specified in Table 8.14. The
resource elements themselves will probably not be a problem and can
easily fit into an already existing reporting code. The real difficulty
is assigning costs, man-hours, and computer-hours to the CPC level for
some activities. This will result in an increase in the reporting effort
required. However, it is the only method of building up cost histories
on a standard list of software components, so that the software costing
job can eventually be simplified. The estimator will eventually be able
to build the total estimate from estimates of the software components.
The data base will provide detail for component estimation and parametric
techniques (as well as analogy techniques). Sizing as well as costing
should become more a science and less an art.

Furthermore, we have carefully selected activities to be tracked
at the CPC level so that artificial allocations are not required and
information is technically easy to track. Also, CPCs have been selected
so that they formed logical work packages. Thus, we believe the expense
will not be severe and the information will be well worth the cost.

Contractor cooperation may remain a problem. Contractors simply
may not want the Air Force to have this much visibility. In response,
we suggest that the results could be made part of the public literature
so that each contractor will have the opportunity to improve his
estimating techniques. Of course, data will have to be aggregated to
protect contractor confidentiality. Perhaps CERs would provide the best
means of presenting this information.


Another question is whether or not there is a convenient way of
reporting the data. Fortunately, there is an automated reporting system
available. It is described in AFSCH 173-4 (Ref. 42). The basic reporting
elements for electronic systems are shown in Table 8.20 (Ref. 41).

The Program Breakdown Structure of Table 8.20 was initially derived
from the Work Breakdown Structure of Mil Std. 881A. It was created to
record the total cost of electronic systems, both hardware and software.
At the time of its creation, software was a minor cost item; hence only
one element, 4210, is solely devoted to software. Even this element
does not include all software costs. For example, some software testing
costs are included in 1050 and some software analysis costs are included
in 1061.

John Glore of the MITRE Corporation has been modifying this
reporting system to record costs of the different software activities and phases
in such a way that total software costs could be aggregated under the
Program Breakdown Code (PBC) 4200. He found that the reporting structure
had room for more levels than shown in Table 8.20 (which was extracted
from Glore's report, Ref. 41). His suggestions are reported below, after which
we show how our reporting system elements can be related to Glore's
suggestions.


TABLE 8.20

STANDARD WBS ELEMENTS FOR ELECTRONIC SYSTEMS
(Ref. 41, Table A2)

Level  PBC      Standard Element Name

1      1000     Electronic System
2      1010     Prime Mission Product
3      1110     Integration and Assembly
3      2110     Sensors
3      3110     Communications
3      4110     Automatic Data Processing Equipment
3      4210     Computer Programs
3      4310     Firmware
3      5110     Data Displays
3      6110     Auxiliary Equipment
3      8110     Air Vehicle
2      1020     Training
3      1021     Equipment
3      1027     Facilities
3      1029     Services
2      1040     Peculiar Support Equipment and Maintenance (Including
                Maintenance Concept)
3      1041     Organizational/Intermediate
3      1044     Depot
3      1049     Other
2      1050     Systems Test and Evaluation
3      1051     Development Test and Evaluation
3      1053     Operational Test and Evaluation
3      1052     Combined DT&E and OT&E
3      1055     Mockups
3      1056     Test and Evaluation Support
3      1057     Test Facilities
3      1059     Other System Tests
2      1060     System Program/Project Management
3      1061     Systems Engineering Management
4      1061A    Reliability
4      1061B    Maintainability
4      1061C    Parts Control
4      1061D    Nomenclature
4      1061E    Aerospace Environment
4      1061F    Transportability
4      1061G    Electromagnetic Compatibility
4      1061H    Radar Frequency Management
4      1061J    Security
4      1061K    Survivability/Vulnerability
4      1061L    System Safety
4      1061M    Communications Long Lines
4      1061N    Radio Frequency Management
4      1061P    Value Engineering
4      1061Q    Availability
3      1062     Supporting Project Management Activities
4      1062A    Program Management
5      1062AA   Program/Contract Work Breakdown Structure
5      1062AB   Cost Information System
5      1062AC   Cost Schedule Systems
5      1062AD   Life Cycle Costs
5      1062AE   Schedule Management
4      1062B    Manufacturing Management
4      1062C    Configuration Management
4      1062D    Integration of Analysis and Related Computer Support
4      1062E    Quality/Inspection
4      1062F    Photographic Documentation
4      1062G    STINFO
3      1063     Integrated Logistics Support
4      1063A    Preoperational Supply Support
4      1063B    Packaging
4      1063C    Transportation
4      1063D    Travel
4      1063E    Maintenance
4      1063G    Limited Spares/Repair Parts Provisioning
3      1064     Crew/Human Factors
4      1064A    Human Engineering
4      1064B    Biomedical/Life Support Equipment
4      1064C    Manpower/Personnel Requirements
4      1064D    Human Factors Test and Evaluation
2      1070     Data
3      1071     Technical Publications
3      1072     Engineering Data
4      1072B    Engineering and Configuration Documentation
4      1072H    Human Factors
4      1072R    Related Design Requirements
4      1072S    System/Subsystem Analysis
4      1072T    Test
3      1073     Management Data
4      1073A    Administrative Management
4      1073F    Financial
4      1073L    Logistic Support
4      1073P    Procurement/Production
       NONE     Support Data
       1074     Data Repository
       1080     Operational/Site Activation
       1081     Contractor Technical Support
       1082     Site Construction
       1083     Site Conversion
       1084     System Assembly, Installation and Checkout on Site
       1085     ADP Support Facilities
       1089     Other Support Facilities
       9200     Common Support Equipment
       NONE     Organizational/Intermediate (including Equipment
                Common to Depot)
       NONE     Depot
       NONE     Industrial Facilities
       NONE     Construction/Conversion/Expansion
       NONE     Equipment Acquisition or Modernization
       NONE     Maintenance
       9600     Initial Spares and Repair Parts
       NONE     (Specify by Allowance List, Grouping, or Hardware
                Element)

Glore's Interim Standard PBCs have the form s421xxy, where

s     is a letter (A, B, ...) that identifies the software's
      supplier

421   identifies the element as a software product

xx*   is an alphanumeric code that designates the software type
      (for example the CPC identifier)

y     when used, is a letter (A-F) that identifies the life-cycle
      phase to which the element applies; if y is not used, the
      element is presumed to encompass all phases covered by the
      contract.

These  PBCs  apply  to  all  the  direct  costs  of  software  development  (or 
the  purchase  or  rental  of  programs).  In  addition,  Glore  defined  the 
following  PBCs  for  software-related  costs,  analogous  to  the  Level  2 
categories  for  hardware-related  costs  in  Table  8.20: 

4220  Software-peculiar training

4240  Equipment required specifically for software development or
      maintenance

4250  Testing of software (includes PQT and FQT)

4260  Software-peculiar management and engineering

4270  Software documentation

4285  Software development and maintenance facilities

4290  Other software-related costs

4200  Summary of all software-related costs

Note  that  these  PBCs  have  only  four  digits  and  are  not  expanded  like  the 
"421"  PBCs. 

* The first "x" is any alphanumeric character but "I" or "0" and the
second is a numeral.

Glore also suggests adding three (or more) digits or letters to
each element, after a "/", to designate the segment, functional area, CPCI,
etc.* Thus, for example, 42134C/123 would cover the coding and checkout
(life-cycle phase C) of CPC 34 in system segment 1, functional area 2,
CPCI 3.
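As an illustration only, the code format described above can be checked mechanically. The following Python sketch is not part of Glore's standard or of AFSCM 173-4; the names `SoftwarePBC` and `parse_software_pbc` are hypothetical. It assumes that the supplier letter is optional, that the excluded first characters of "xx" are the letters I and O (so that codes such as 42100B remain valid), and that the suffix after "/" is kept as an uninterpreted string of three or more characters.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern for s421xxy[/suffix]. Assumptions (not from the source standard):
# the supplier letter is optional, the first "x" may not be the letter I or O,
# and the second "x" is a numeral.
_PBC_PATTERN = re.compile(
    r"^(?P<supplier>[A-Z])?421"
    r"(?P<cpc>[A-HJ-NP-Z0-9][0-9])"
    r"(?P<phase>[A-F])?"
    r"(?:/(?P<suffix>[A-Z0-9]{3,}))?$"
)


@dataclass
class SoftwarePBC:
    supplier: Optional[str]  # "s": supplier letter, if present
    cpc: str                 # "xx": software type, e.g. the CPC identifier
    phase: Optional[str]     # "y": life-cycle phase letter A-F, if present
    suffix: Optional[str]    # characters after "/": segment, functional area, CPCI


def parse_software_pbc(code: str) -> SoftwarePBC:
    """Split a 421-series PBC into its parts, or raise ValueError."""
    match = _PBC_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a 421-series PBC: {code!r}")
    return SoftwarePBC(
        match["supplier"], match["cpc"], match["phase"], match["suffix"]
    )
```

Under these assumptions, the example 42134C/123 parses to CPC "34", phase "C", and suffix "123", while a four-digit code such as 4250 is rejected because it is not expanded like the 421 codes.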

Glore suggests that where a particular item (e.g., management)
includes both software and hardware costs, it should be allocated between
the corresponding PBCs. Consistency is hard to maintain in this situation,
however, and we recommend that only purely software items, or items whose
software components can be explicitly separated and consistently tracked,
be reported to the software level. Other costs should be collected as
system-wide costs and allocated between software and hardware with a
general overhead factor in CERs.

In general, we were able to fit our reporting elements (Table 8.6)
into Glore's framework. For direct labor man-hours we suggest the
assignments shown in Table 8.21. As can be seen, there is no category
for software support personnel not directly chargeable to resource
elements; we suggest defining a new category, 4230. A less desirable
alternative is to charge these personnel to general overhead. We suggest
charging the costs of operating this reporting system itself to project
management (4262A) and program management documentation (4273). The
testing PBC (425z), we suggest, should include only the resources
required for explicitly testing software; testing that includes other
components of the system should not be charged here.

We have made some modifications of Glore's definitions. First,
we use the "y" in the PBCs 421xxy to designate resource elements instead


* "Segment" and "Functional Area" are more aggregated levels of the system
that includes the software.


TABLE 8.21

REPORTING CODES FOR DIRECT LABOR HOURS
(Resource Element Definitions - Table 8.6)

Resource Elements                 Interim PBC

CPC LEVEL
  Design                          421xxB/abc
  Coding                          421xxC/abc
  Integration                     421xxD/abc
  Installation                    421xxE/abc
  Maintenance                     421xxF/abc

  ("xx" identifies the particular CPC; "abc" identifies the segment,
  functional area, and CPCI)

CPCI LEVEL
  Design                          42100B/abc
  Testing                         425z/abc
  Independent V&V                 425z/abc

  Note: values of z     Devel.   Deployment   Operation
        Test              1          2            3
        V&V               4          5            6

SYSTEM LEVEL
  Analysis                        42100A/ab
  Testing                         425z/ab (see Note above)
  Independent V&V                 425z/ab (see Note above)
  Management
    Engineering                   4261/ab
    Configuration                 4262C/ab
    Project                       4262A/ab
  Training                        4229/ab
  Documentation
    Technical                     4272/ab
    Configuration                 4274/ab
    Program Management            4273/ab
  Support                         4230/ab (or charge to overhead)
of life-cycle phases (milestone definitions). The time of reporting
suffices to identify reported costs with life-cycle phases. Our use
is in keeping with the activity definition of life-cycle terms (i.e.,
analysis, design, etc.). We have designated 42100y to be an aggregated
category for CPCI-level or system-level activities.

Also, we have added analysis and design to the 421 codes, leaving
only engineering management reported against 4261 instead of "engineering
activities" as in Glore's original definition.

Reporting codes for costs other than direct labor are shown in
Table 8.22. Costs include operation as well as acquisition. Since
acquisition costs are also reported separately (Table 8.19), operating
and investment costs can be separated if desired. Also, since
availability is reported separately, an operating cost for the specified
system can be derived as a function of availability.

Training facilities have been included as an item, although we do
not expect separate software training facilities as a rule. This item,
too, will include investment as well as operation. If new facilities
are not required, then a rental charge should be used.

One remaining (and important) item has not been covered.
Computer-hours should be broken out in the same manner as direct
man-hours. We recommend reporting them under the 421 PBCs, distinguished
from man-hours by using the letters G-L instead of A-F in the "y" position.
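The letter convention just described can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the one-to-one pairing of letters in alphabetical order (A with G, B with H, and so on through F with L) is an assumption, since the text states only that G-L replace A-F.

```python
MAN_HOUR_LETTERS = "ABCDEF"       # "y" letters used for direct man-hours
COMPUTER_HOUR_LETTERS = "GHIJKL"  # "y" letters used for computer-hours


def to_computer_hour_letter(y):
    """Map a man-hour letter (A-F) to the assumed computer-hour letter (G-L)."""
    index = MAN_HOUR_LETTERS.index(y)  # raises ValueError outside A-F
    return COMPUTER_HOUR_LETTERS[index]


def classify_y(y):
    """Return (hour type, equivalent man-hour letter) for a "y" letter."""
    if y in MAN_HOUR_LETTERS:
        return ("man-hours", y)
    if y in COMPUTER_HOUR_LETTERS:
        return ("computer-hours", MAN_HOUR_LETTERS[COMPUTER_HOUR_LETTERS.index(y)])
    raise ValueError(f"unknown y letter: {y!r}")
```

With this pairing, computer-hours for the element whose man-hours are reported under 421xxC would be reported under 421xxI.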

In  summary,  we  are  using  the  following  definitions  for  PBCs: 

4210: Computer Programs. All analysis, design, coding, and error
correction during testing and maintenance. Also includes code
modification for site-specific requirements. Costs of programs which are
acquired instead of developed should also be included.
TABLE 8.22

REPORTING CODES FOR OTHER COSTS

Resource Elements          Codes

Computer Resources
  Development              4249A
  Maintenance              4244A

Unique Facilities
  Development              4285A
  Maintenance              4285B
  Training                 4227

Unique Equipment
  Development              4249B
  Maintenance              4244B
  Training                 4221

4220: Software-Peculiar Training. Preparation of material and
conduct of courses to instruct personnel in the operation of the software
(e.g., analyzing diagnostics). Also includes demonstration of software
when demonstration is aimed at instruction.

4240: Software-Peculiar Support Equipment. Special equipment
required to develop and/or maintain the software. Equipment not required
for development will be charged after the beginning of Qualification Test.
Equipment required for deployment will not be included in the category
even though used in development; a separate one-time report describing such
equipment should be included as a separate data item. Equipment related
to the computer resources will be separated from other software-peculiar
support equipment. Operating costs will also be reported here, including
personnel to operate and maintain the computer center. The actual
facilities, building, grounds, air conditioning, etc. will be reported
under 4285.

4250: Testing of Software. Man-hours and costs necessary for the
testing of software after initial debugging of the source deck.
Specifically excludes computer resources (4240) or facilities (4285)
necessary to perform the tests. Also excludes redesign and recoding to
correct errors (4210). Includes time necessary to plan and prepare test
runs. Formulation of test plans is also included, but test documentation
goes under 4270.

4260: Software-Peculiar Management and Engineering. Management
has been divided into three parts: engineering (4261), configuration
(4262C), and project (4262A). Engineering management costs will be
charged almost entirely by technical managers not assigned to specific
sections of code. Special studies will largely be picked up in analysis
and design (4210). However, there will still be requirements for
preparation and conduct of technical review meetings and general
technical management. This will generally be easy to separate from
hardware.

Configuration management costs are those associated with the control
of baselined software and probably the maintenance of the library. They
exclude the writing of special reports, which is reported under 4270.
Items are easily separable from hardware.

Project management includes financial, scheduling, and other
nontechnical management functions. This may be so integrated with hardware
that separation is impossible. If so, do not report separately from
1060. An overhead rate can be applied.

4270: Documentation. This element includes all reports that can
be separately identified with software. It is separated into three parts:
technical, configuration, and program management. Technical documentation
includes all technical reports (e.g., Subsystem Design Analysis Reports,
Trade Study Reports, Test Plans, Test Procedures, Test Results, User's
Manuals, etc.).

Configuration documentation includes documentation of the Part I
and Part II Specification, data base, special status reports, etc.

Program Management documentation may not include any purely
software reports. Perhaps the reporting system described here will be
separated from other Cost Reports. If so, its cost could be shown here.

4285: Development and Maintenance Facilities. This element
includes the acquisition and operating costs of special facilities. If
facilities are shared, a rental charge will represent acquisition. No
man-hours for facilities maintenance and operation are required.

We have now shown that an automated reporting system is available
and can be used to collect required man-hour, cost, and computer-hour
data on a regular basis. Even so, the reader is reminded that the
limitation on what should be reported is directly related to the requirement
for cooperation by contractors. The reporting requirements are within
contractors' technical capability; however, their willingness may be
another matter. We have tried to find a level of detail which meets
Government goals and yet requires only a moderate addition to the
contractors' effort.

8.9  ENGINEERING  CHANGE  PROPOSALS 

A problem which has been ignored to this point is how to handle
changes to requirements as represented by ECPs changing the Part I
Specifications. They present a special problem in that they represent
an approved change to the flow of normal development and maintenance
activities. They should be isolated and their impact separately measured.
Thus, they should be treated as a separate development with their own
reporting system.

During maintenance, this is a reasonably straightforward task.
Some care has to be taken in evaluating the results, however. First, as
discussed in Sec. 8.6, most analysis has taken place prior to the formal
submission of the ECP (Table 8.13). Therefore, few if any analysis hours
are involved. Second, noncritical errors discovered during qualification
testing are often treated as maintenance problems to be solved with
the next ECP. Third, management hours will be hard to separate from the
ongoing maintenance activities and, except for configuration management,
separation should not be attempted. Instead, they should be treated as an
overhead item. Equipment and facilities have similar problems.

All of the above understate the cost of the ECP, but adjustments
can be made with overhead rates when comparing ECPs to new developments.
As long as what is excluded is understood, no problem should arise.

It is important to note that a model change (product improvement
contract) may close out a number of ECPs and Discrepancy Report Forms
(DRFs). In this case, it is the model change that should be tracked
and not each separate ECP. However, all work on a particular ECP should
be included under only one model revision contract. Furthermore,
reporting should separate error corrections (DRFs) from product
improvement (ECPs), making it possible to separate maintenance from the
model change.

During development, the ECP problem is much more complicated because
the impact of the ECP does not have a specific end date with the delivery
of a product. Instead, the ECP will be integrated into the development
program, affecting some completed tasks, causing work to be modified or
redone, and to a lesser extent affecting tasks not initiated, by changing
requirements or scope.

We recommend that each ECP be tracked separately until it is
totally integrated with the rest of the development. This can be done
by identifying the portion of the software made obsolete by the
requirement change and estimating the resources required to replace the
obsolete portion and add the new capability.

Cost (man-hours and computer-hours) should then be tracked until
the work has been redone. Thus, integration and testing for the ECP
would not be reported if the ECP were introduced during design and the
changes were incorporated prior to module baselining.

8.10  PRE-CONTRACT  ACTIVITIES 

One final point should be made that will be useful when using the
results of this reporting system. A development is often made up of
a series of contracts, not just one (for full-scale development). When
the history of the development of the total defense system is finally
put together, all contracts, with their separate costs and histories,
will have to be integrated for comparative purposes.

Thus not only should the full-scale development contractor be
reported, but so should the independent V&V contractor, as well as the
concept formulation and validation contractors, if software analysis has
been included in those phases or computer-program prototypes developed.

It should be noted that the automated reporting system discussed
in AFSCM 173-4 has provision for multiple contractors. Different letters
are used at the front of the code. Thus, the code is actually
s421xxy/abc, where s designates the supplier.


In the past months, GRC has studied the problem of software
cost estimation, hypothesizing relationships, gathering and analyzing
data, and examining reporting systems. In this section, significant
contributions to the improvement of software cost estimating and data
collection are summarized (Sec. 9.1). The subject is large and much
remains to be done. Our recommendations for future work are presented
in Sec. 9.2.

9.1  CONCLUSIONS 

Significant  findings  of  this  study  are  organized  around  the  two 
major  topics  of  the  study:  (1)  the  definition  of  resource  elements, 
and  (2)  the  development  of  cost  estimating  relationships. 

In Sec. 8, our recommendations for resource data elements are
presented. They call for the regular collection of data on costs,
man-hours, computer-hours, and completion status. Other information
is also specified, such as a detailed description of the software
product, the characteristics of the computer system used to develop
or maintain the software, etc. We recommend collection of this latter
information at specified milestones only. Its level of detail is
much greater than that required of the regularly reported data; however,
the infrequency of reporting should ease the burden on the contractor.

Important innovations in the resource element definitions include
the following:


measurement of progress towards software completion. This milestone
separates coding and checkout from test and integration. Each time a
source deck is completed (coded and debugged), the deck is baselined
and future work on that part of the program will be recorded against
test and integration. Thus, work in these two phases will go on
concurrently and data will be continuously available on the percentage of
the product which has finished initial coding and checkout and is in
the test and integration phase.

2. No requirement to separate redesign and recoding efforts in
the Test and Integration Phase. We speculate that this requirement
in prior cost reporting systems has been one of the reasons for their
non-implementation. Redesign and recoding in this period are often done
by the same person and therefore difficult to separate. If separate
reporting were required, it would probably be artificial, so the
information content is suspect at any rate.

3.  A preliminary  standardized  list  of  end  items  (CPCs)  for 
comparing  resource  consumption  among  software  developments.  This  list 
will  serve  to  disaggregate  the  software  product  into  meaningful  parts. 
Ultimately,  software  will  be  described  by  its  parts,  with  estimates 
made  at  the  parts  level  and  then  aggregated,  much  as  hardware  estimates 
are  made  today.  The  list  is  preliminary  in  that  we  are  confident  that 
additions  will  be  made  upon  review  by  the  software  community. 

4. Separation of maintenance into error response and product
improvement. Only error response should be associated with the original
software development. Product improvement (through ECPs) should undergo
separate approval and be subject to the same data gathering as any other
development.


5. Identification of differences in "milestone" and "activity"
for Analysis, Design, Coding, and Testing. The inconsistent
use of these terms has been a source of confusion during the entire
effort, and undoubtedly in other data gathering and analysis attempts.
Ultimately, different terms for the milestone and activity uses should
probably be selected.

In addition to defining the data elements, we have demonstrated
one way of mapping these elements into an already existing reporting
system. Hence the mechanics of data retrieval will not be severe.

In summary, we believe these innovations will improve the chances
that contractors will accept a reporting system, without sacrificing the
needs of the Program Offices for cost control data and the Air Force's
long-term needs for better data.

Our second major objective was to develop improved software cost
estimating relationships. A significant amount of work had been
previously devoted to this task and reported in the literature. This
previous work was performed by competent groups and focused on
estimating total man-hours or costs. Results had been disappointing, with
derived relationships exhibiting large variances.

As a result, we were directed to take a new approach and focus our
attention on the development of estimating relationships for each of the
life-cycle phases. Our efforts have been concentrated on man-hour
estimation. During the contract, we first hypothesized relationships. We
then identified, collected, and analyzed data. Important findings
include the following.

1.  Accurate  estimating  relationships  for  each  life-cycle  phase 
cannot  be  developed  independent  of  the  other  phases.  Data  when  plotted 
exhibits  significant  variation.  As  a consequence,  an  estimate  of  total 
man-hours  made  by  estimating  each  life-cycle  phase  and  then  adding  will 
have  less  precision  than  that  made  on  the  total. 

2. Estimating the trade-offs between the life-cycle phases is
of prime importance. Only then will the variation in the total man-hour
estimate be reduced, i.e., the accuracy of the estimating relationship
be increased. In effect, covariances must be incorporated in the
estimators. Reducing the risk in software cost estimates is dependent on
quantifying these interrelationships between the life-cycle phases.

3. Estimating the trade-offs can also lead to the development
of rules for optimal allocation of resources among life-cycle phases.
Quantifying the trade-offs between life-cycle phases makes it possible
to derive such rules. Application of the rules, in turn, will lead to
less costly or lower-risk developments. In effect, these rules will
represent estimating relationships for software developments with an
efficient allocation of resources. Note that we are in effect
asserting that much of the variance in previous estimating techniques is the
result of mixing software developments which have inefficient as well as
efficient allocations of man-hours (among life-cycle phases) in the
same data base.
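The role of covariances in findings 1 and 2 can be illustrated with the standard identity for the variance of a sum: the variance of total man-hours equals the sum of the phase variances plus twice the sum of the covariances between pairs of phases. The Python sketch below is illustrative only. The function name is hypothetical and the numbers are invented for the example; they are not data from this study. Negative covariances stand for trade-offs, in which additional effort in one phase is associated with reduced effort in another.

```python
from itertools import combinations


def variance_of_total(variances, covariances=None):
    """Variance of the sum of phase man-hours.

    variances:   dict mapping phase name -> variance of that phase
    covariances: dict mapping frozenset({phase_a, phase_b}) -> covariance;
                 pairs that are not listed are treated as zero
    """
    covariances = covariances or {}
    total = sum(variances.values())
    for a, b in combinations(variances, 2):
        total += 2 * covariances.get(frozenset((a, b)), 0.0)
    return total


# Hypothetical values chosen only to show the arithmetic.
phase_variances = {"design": 400.0, "coding": 900.0, "test": 1600.0}
phase_covariances = {
    frozenset(("design", "test")): -500.0,
    frozenset(("coding", "test")): -300.0,
}

ignoring_tradeoffs = variance_of_total(phase_variances)
with_tradeoffs = variance_of_total(phase_variances, phase_covariances)
```

In this invented case the variance of the total falls from 2900 to 1300 once the trade-offs are included, which is the sense in which phase-by-phase estimates that are simply added have less precision than an estimator that incorporates the covariances.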

Although  the  data  collected  was  insufficient  to  quantify  these 
trade-offs,  we  were  able  to  locate  a data  base  (PARMIS)  that  offers 
promise.  Detailed  man-hour  data  has  been  maintained  in  PARMIS  for  2,000 
projects.  The  development  environment,  language,  and  application  areas 
are  all  similar,  so  that  these  significant  parameters  will  not  cloud  the 
search  for  trade-off  relationships. 

In addition to the life-cycle phase work, we have examined, in
aggregate, a number of basic relationships which can be incorporated
into today's estimating techniques. Among those examined are:

1.  The  similarity  between  the  development  processes  for  "ADP" 
and  defense-system  software.  It  is  our  belief  that  a model  which  explains 
ADP  software  costs  will  also  be  a good  predictor  of  defense-system 
software  costs.  The  coefficients  of  the  model  will  most  certainly  change, 
but  the  model  forms  can  be  the  same  because  the  procedures  are  similar. 
Therefore,  we  examined  the  relationship  of  man-hours  to  lines  of  source 
code for a mixture of software developments, stratifying on type.
Defense-system programs were more expensive, but followed the same trend as
ADP programs.


2.  The  relationships  of  aggregate  software  cost  to  key  driving 
variables  such  as  size,  language,  space  constraint,  and  requirements 
changes  (ECPs)  during  development.  Results  are  given  in  Sec.  6.3. 

3.  The  relationships  between  different  measures  of  software 
product  such  as  lines  of  source  code,  size  of  Part  II  Specifications, 
and  number  of  object  instructions.  Conversion  ratios  between  object 
and  source  code  for  different  languages  and  machines  were  developed. 

4.  The  magnitude  of  maintenance  activities  and  the  separation 
of  error  correction  from  product  improvement.  Examples  were  given  in 
which product improvement was the major consumer of resources during
maintenance.  Relationships  for  estimating  error  rate  were  derived,  and 
the  difference  in  maintenance  manning  requirements  for  source  and  object 
programs  was  quantified. 

9.2  RECOMMENDATIONS  FOR  FUTURE  WORK 

Based  upon  this  study,  we  believe  the  following  efforts  will  best 
contribute  to  the  complicated  job  of  software  cost  estimation. 

First, and of utmost importance, implement a cost reporting system.

3 

If the SDC system, developed in 1967, had been implemented, we would have
had a usable cost data base today. We hope our suggested elements are
the basis for the approved cost reporting system, as we believe they
avoid some of the impediments to contractor acceptance, and offer a
great deal of control for the Program Offices. However, the important
thing is to implement some system. Furthermore, we agree with Devenny^
that use of the reporting system must be a contractual requirement.


Second, we recommend that the Air Force continue the study of the
PARMIS data base begun in this contract. PARMIS is the only readily
available data base in which the trade-offs between phases can be
examined. Although our results to date are quantitatively inconclusive,
due to the small sample size extracted from that data base, the existence
of the trade-offs has been demonstrated. Furthermore, the fact that
PARMIS is uniform with respect to major cost driving parameters such
as language, applications, etc., and the detailed information available
on the product developed, offer a real opportunity to quantify these
interrelationships.

The  development  process  for  ADP  software  at  AFDSDC  is  similar  to 
that  for  major  defense  system  software.  Therefore,  the  functional  forms 
of  the  relationships,  if  not  the  relationships  themselves  (with  some 
scaling  rules),  can  be  used  for  defense  system  applications.  It  is  most 
important  to  quantify  these  relationships  because  they  are  the  key  to 
identifying  proper  resource  allocations  among  the  life-cycle  phases 
and  thereby  achieving  risk  reduction  in  software  developments. 

Third,  and  concurrently,  a systematic  collection  of  data  from 
completed  projects  should  be  initiated.  Although  this  is  expensive  and 
time-consuming,  we  cannot  afford  to  wait  10  to  15  years  until  the  data 
from  a reporting  system  implemented  today  is  usable  for  cost  estimation. 


SAMPLE PARMIS QUESTIONNAIRE

1. Project Originator Number (PON)

2. Project Description

3. Size of system described by PON:

    • Computer words of instructions
    • Computer words of instructions and data
    • Number of relocatable object instructions

4. Computer System (Manufacturer and Model):

    • Development
    • Operation

5. Computer Word Size:

    • Development
    • Operation

6. Constraints (space, time, core, peripheral storage, etc.):

    • Development
    • Operation

7. Date baseline established:

    • Functional Specs
    • Design Specs
    • Source Programs
8. Programming Languages:

    Language            Lines of Source Code

9. Assembler and Compiler:

    • Was the assembler hosted by the target machine?
    • Was the compiler hosted by the target machine?

10. Development Methods Employed:

    • Top Down Design
    • Chief Programmer Teams
    • Structured Programming
    • Software Development Libraries
    • Other

11. Software Development:

    • What % of software was based on requirements adapted from
      an existing system?
    • What % of software was based on design adapted from an
      existing system?
    • What % of code was adapted from existing code (e.g.,
      translated from another language)?
    • What % of code already existed?
    • What special hardware and/or software items were provided
      specifically to support the development of the software
      (e.g., purchase of development software)?
      Cost?

12. Verification and Validation (V&V) tools:

    • Was a separate V&V organization used?
    • Were automated V&V tools used by:
        The developing organization
        A V&V organization

13. Software Installation:

    • Number of installation sites
    • Number of subsystems requiring modifications, by site
    • Time required for installation, by site
    • Total cost of installation, by site
    • Total man-hours for installation, by site
    • Total computer hours/dollars for installation, by site

14. Operation:

    • Number of man-hours/month of the operational lifetime required
      to operate the software component of the system (i.e., not
      users, but operators)
    • Cost per man-hour
    • Planned system operational lifetime

15. Maintenance:

    • Number of hours per operational month for maintenance:
        Functional Analyst
        Data Systems Analyst
        Programmer
        Support Personnel
    • Computer hours/dollars for maintenance for each month of
      operational lifetime
16. Software Faults:

    • For each software fault requiring maintenance action:
        Date reported
        Date closed
        Lines of code
        Was documentation changed?
        Priority of fix activity
        Man-hours
        Computer hours

17. Software Changes:

    • For each formal change order affecting software during
      development:
        One-line description
        Date filed
        Date approved
        Date work started
        Agency originating change
    • For each subsystem affected by the change:
        Name
        Were functional specs changed?
        Were design specs changed?
        How does the change affect:
          • Work already completed?
          • Work remaining?
        How were milestones affected?

18. Computer Time:

    Cumulative computer time at each major milestone:
        Design completion
        Coding and debugging completion
        Unit testing completion
        Integration testing completion
        Environmental Systems Test I completion
        Environmental Systems Test II completion

19. Fault Isolation:

    Number of man-hours per month of operational lifetime
    required to isolate faults within the system

    Cost per man-hour

    Total number of anomalies reported by month

    Fraction of reported anomalies ultimately attributed to
    software, reported by month

    For each fault isolated:
        Date reported
        Date identified
        Action required to fix:
          • Low priority software fixes
          • High priority software fixes
          • System change notices
          • No action
        Date corrected
20. Documentation:

    • Number of pages of functional specifications devoted to this
      system or subsystem when project initially approved
    • Dates and affected numbers of pages of the functional
      specifications for each change during development
    • Number of pages of first design specifications generated for
      this system when approved or baselined
    • Dates and affected numbers of pages of each subsequent change
      to the design specifications
    • Number of pages of programmer documentation prepared for this
      system when first released
    • Dates and affected number of pages of all changes to programmer
      documentation prior to system release
    • Number of pages of user documentation prepared for this system
      when first released
    • Dates and affected numbers of pages for all changes to user
      documentation prior to system release
APPENDIX C


JOB  DESCRIPTIONS 


C.1 MANAGER

Managers  direct  and  control  the  performance  of  technical  staff 
so  as  to  satisfy  the  customer's  technical  and  contractual  requirements 
within  the  project  budget.  They  require  the  skills  to  perform  the 
following  functions: 

• Specify and schedule tasks to fulfill project requirements.

• Recommend and negotiate for technical staff to perform the
tasks.

• Direct technical and cost performance of the project team
to ensure meeting the project's goals and objectives.

• Monitor the preparation of reports and briefings required
by contract or requested by customer.

• In some cases, serve as a task leader or technical contributor.


Managers  are  classified  according  to  level  of  experience,  expertise,  and 
management  skills  as  follows: 


1. Project. Administratively directs and controls an entire
project. Usually a "pure" manager who makes little or
no technical contribution.

2. Technical. Technically directs and controls an entire project.
Concerned with those technical aspects of the
project that influence management decisions.

3. Configuration. Ensures "quality control" of the software
product. Manages the process of baselining specifications
and the source code as it progresses through the phases of
the system life cycle.

Managers may be selected from most of the other job categories
identified in this section, depending upon the specific requirements of the
project.

C.2  SCIENTIST  AND  ENGINEER 

These persons perform basic and applied research; participate in
the analysis and design phases of system development; and assist in the
evaluation of systems. No attempt is made here to differentiate between
specialties (e.g., electronics, propulsion systems, communications). We
simply identify a general category of scientists and engineers who
possess the skills to perform the following functions:

• Develop general system concepts and procedures.

• Specify formulas and algorithms to be implemented.

• Design, develop, and test prototypes of critical elements.

• Analyze available equipment to evaluate its suitability.

• Provide technical guidance to persons responsible for
developing the software required to support the operation
of the system.

Scientists  and  engineers  are  classified  according  to  level  of 
experience,  expertise,  and  management  skills  as  follows: 

1.  Senior  Scientist  or  Engineer.  Directs  and  coordinates  all 
technical  activities  necessary  to  complete  a single  large 
scientific  or  engineering  project,  or  several  smaller 
projects. 

2. Scientist or Engineer A. Directs a major technical portion
of a large project, or an entire project of lesser complexity
than those normally assigned to a senior scientist or
engineer. May also supervise other technical personnel, and
may be responsible for the administrative duties of a small
work group, although the technical work performed is more
important than the supervision.

3.  Scientist  or  Engineer  B.  Analyzes  complex  scientific 
or  engineering  problems  with  minimal  supervision  and 
guidance. 

4. Scientist or Engineer C. Analyzes scientific or engineering
problems under the immediate supervision of a higher-level
person.

C.3  SYSTEMS  ANALYST 

Systems  analysts  require  the  technical  skills  to  perform  the 
following  functions: 

• Confer with officials, scientists, and engineers to facilitate
understanding of the operational needs and goals of
a new application or refinement of an existing application.

• Formulate a comprehensive statement of a problem and develop,
review, and recommend methods, procedures, and systems to
solve it.

• Develop specifications for computer hardware and software,
including block diagrams, flow charts, record and form
layouts, etc.

• Verify that the software design meets each of the
operational needs and goals.

• Document the final design in a manner suitable for use in
programming.

• Conduct follow-up studies (as needed) to ensure that the
evolution of the software during the development process
has not altered the design to the point that it no longer
satisfies system requirements.
Systems analysts are classified according to level of experience,
expertise, and management skills as follows:

1. Senior Systems Analyst. Analyzes and evaluates potential
new software applications, refines existing applications,
formulates statements of problems and devises solutions,
prepares general block diagrams, and supervises systems
analysts in the development of assigned portions of the
software design.

2. Systems Analyst A. Conducts a significant portion of the
applications analysis and resulting system design, including
development of system hardware and/or software specifications,
data verification methods, etc. May also assist in
supervising other systems analysts.

3. Systems Analyst B. Independently develops assigned portions
of the system design, including block diagrams, flow charts,
record layouts, and other portions of the system design
documentation.

4. Systems Analyst C. Conducts analyses of a less complex
nature and, in general, provides assistance on specific
tasks as directed by other members of the systems analysis
team.

C.4  SYSTEMS  PROGRAMMER 

Systems  programmers  require  the  skills  to  perform  the  following 
functions: 

• Conceive, code, test, modify, and maintain the systems programs
that carry out the internal functions of a digital
computer.

• Develop extremely complex systems software such as operating
systems, sophisticated file management routines, large
telecommunications networks, advanced mathematical and
scientific software packages, multiprogramming routines,
compilers, link editors, and assemblers.

• Develop moderately complex systems software such as utilities,
job control language, macros, subroutines.

• Install vendor-supplied utilities, application packages,
and engineering releases.

Systems programmers are classified according to level of experience,
expertise, and management skills as follows:

1. Senior Systems Programmer. Develops extremely complex systems
software such as an entire operating system or complex
subsystems. May also provide significant technical direction
to lower-level personnel.

2. Systems Programmer A. Assists in the development and modification
of extremely complex systems software as defined by
higher-level personnel.

3. Systems Programmer B. Assists in the definition and programming
of moderately complex systems software, as well as
some portions of extremely complex systems software.

4. Systems Programmer C. Assists in coding and maintaining
systems software of moderate complexity, program libraries,
and technical manuals under the technical direction of
higher-level personnel.

C.5  APPLICATIONS  PROGRAMMER 

Applications programmers require the skills to perform the following
functions:

• Assist other members of the project team in analyzing
and defining specific applications, methods of approach,
specifications or requested modifications of software,
and correction of software errors.

• Logically analyze a problem, prepare block diagrams and
flow charts, and code programs, including the partitioning
of software systems into individual modules.

• Check programs for logical sequence, perform test runs,
compare test results against known solutions, determine
causes of software malfunctions, and make corresponding
corrections, as needed.

• Document the developed software in the form of program
descriptions, block diagrams, flow charts, listings, test
data, test results, etc.

Applications programmers are classified according to level of
experience, expertise, and management skills as follows:

1. Senior Applications Programmer. Provides assistance (as
needed) in analyzing and evaluating software applications;
assists and supervises programmers in the development and
modification of applications software.

2. Applications Programmer A. Applies programming techniques
and mathematical theories in the development and modification
of applications software.

3. Applications Programmer B. Independently prepares short
applications programs, or prepares portions of more complex
applications programs in accordance with specifications.

4. Applications Programmer C. Assists in the preparation of
short applications programs or portions of more complex applications
programs, in accordance with specifications and
specific instructions.
C.6 PROGRAM LIBRARIAN/SECRETARY

Program librarian/secretaries require the skills to perform the
following functions:

• Maintain a program library which contains all source code,
relocatable modules, linkage-editing statements, object
modules, job control statements, test information, etc.

• Maintain current listings of the essential portions of the
library as appropriate for each programmer.

• Update the library.

• Retrieve modules for compilation, run jobs, and store
results, including those for test runs.

• Produce  library  status  listings. 

Program librarian/secretaries are classified according to level
of experience, expertise, and management skills as follows:

1. Senior Program Librarian/Secretary. Directs and coordinates
all activities associated with the program library, including
the instruction of personnel regarding operating procedures
and assignments.

2. Program Librarian/Secretary A. Maintains the program
library and reports its status to the immediate supervisor.

3. Program Librarian/Secretary B. Maintains portions of the
program library for particular programmers.

4. Program Librarian/Secretary C. Submits decks for compilation
and/or subsequent execution, enters test data, forms the
data base, etc.
C.7 DATA ENTRY OPERATOR


Data entry operators require the skills to perform the following
functions:

• Operate data entry devices such as keypunch, key-to-tape,
and key-to-disk.

• Verify entered data.

• Perform related clerical tasks.

Data  entry  operators  are  classified  according  to  level  of  experience, 
expertise,  and  management  skills  as  follows: 

1. Senior Data Entry Operator. Supervises data entry operators,
schedules their assignments and activities, instructs
operators on procedures used to perform assignments, and
trains new operators assigned to the project.

2. Data Entry Operator A. Operates data entry devices in
recording a variety of data, instructs operators on procedures
used to perform routine assignments, and assists in
training new operators, with minimal supervision.

3. Data Entry Operator B. Operates data entry devices in recording
a variety of data, verifies entered data, and performs
related clerical duties, under direct supervision.

4. Data Entry Operator C. Performs data entry on a volume
basis with only one or two input formats and a single type
of data entry device, under direct supervision. May also
occasionally assist in other related types of work.
TABLE C.1

1975 SALARY LEVELS FOR EACH SKILL CATEGORY

                                            Hourly Rates
Skill Category                      Low    1st Q      Avg    3rd Q     High

Management Personnel:¹
  Project                     $11.78, $16.87, $20.48
  Technical                   12.02, 15.29, 17.69
  Configuration²              --, --, --

Scientists and Engineers:³
  Senior                      $13.62
  Level A                     12.84
  Level B                     10.18
  Level C                     6.60

Systems Analysts:
  Senior                        $5.10    $8.20    $9.45   $10.55   $15.50
  Level A                        4.40     7.10     8.18     9.13    13.40
  Level B                        3.90     6.30     7.25     8.10    11.90
  Level C                        3.40     5.50     6.33     7.08    10.38

Systems Programmers:⁴
  Senior                        $5.28    $7.88    $8.93    $9.85   $15.15
  Level A                        4.63     6.90     7.83     8.65    13.30
  Level B                        3.80     5.65     6.40     7.08    10.90
  Level C                        3.35     4.98     5.65     6.23     9.60

Applications Programmers:⁴
  Senior                        $4.15    $6.78    $7.70    $8.45   $14.15
  Level A                        3.65     5.98     6.80     7.45    12.48
  Level B                        3.20     5.25     5.98     6.55    10.98
  Level C                        2.83     4.63     5.28     5.78     9.68

Program Librarians/Secretaries:⁵
  Senior                      $5.94
  Level A                     5.31
  Level B                     4.96
  Level C                     4.56

Data Entry Operators:⁴
  Senior                        $2.65    $3.80    $4.43    $4.90    $8.60
  Level A                        2.35     3.35     3.93     4.35     7.60
  Level B                        2.10     3.00     3.53     3.90     6.83
  Level C                        1.85     2.68     3.10     3.45     6.05

¹Figures taken from private nationwide survey of R&D companies for 1975.

²Information for this skill category is currently not available.

³Figures taken from Battelle's nationwide survey of aerospace scientists
and engineers in R&D for 1975.

⁴Figures taken from Hansen's Weber nationwide survey of data processing
personnel for 1975.

⁵Figures taken from private, southern-California survey of computer
science secretaries for 1975.
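
As an illustration of how the hourly rates in Table C.1 might be applied, the sketch below multiplies estimated hours per skill category by the average (Avg) rate to obtain a direct labor cost. The function name and the staffing hours are hypothetical; only the rates are taken from the table, and overhead, fringe benefits, and fee are deliberately left out.

```python
# Average (Avg) 1975 hourly rates from Table C.1, in dollars.
AVERAGE_HOURLY_RATE = {
    ("Systems Analyst", "Senior"): 9.45,
    ("Systems Analyst", "A"): 8.18,
    ("Systems Analyst", "B"): 7.25,
    ("Systems Analyst", "C"): 6.33,
    ("Systems Programmer", "Senior"): 8.93,
    ("Systems Programmer", "A"): 7.83,
    ("Systems Programmer", "B"): 6.40,
    ("Systems Programmer", "C"): 5.65,
    ("Applications Programmer", "Senior"): 7.70,
    ("Applications Programmer", "A"): 6.80,
    ("Applications Programmer", "B"): 5.98,
    ("Applications Programmer", "C"): 5.28,
    ("Data Entry Operator", "Senior"): 4.43,
    ("Data Entry Operator", "A"): 3.93,
    ("Data Entry Operator", "B"): 3.53,
    ("Data Entry Operator", "C"): 3.10,
}


def direct_labor_cost(hours_by_skill):
    """Sum hours times the average hourly rate over (category, level) pairs."""
    total = 0.0
    for skill, hours in hours_by_skill.items():
        if skill not in AVERAGE_HOURLY_RATE:
            raise KeyError(f"no average rate tabulated for {skill}")
        total += hours * AVERAGE_HOURLY_RATE[skill]
    return total


# Hypothetical staffing estimate in hours, for illustration only.
staffing = {
    ("Systems Analyst", "Senior"): 100.0,
    ("Applications Programmer", "B"): 400.0,
    ("Data Entry Operator", "C"): 50.0,
}
cost = direct_labor_cost(staffing)
print(f"Direct labor cost: ${cost:,.2f}")
```

Categories for which the table gives no average rate (for example, Configuration managers) raise an error here rather than being silently costed at zero.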